datasets

Just Put It On a Map

20 Mar 2026

progressandpoverty.substack.com

An underrated strategy for urbanist persuasion, powered by open source tools

Alan Lomax's Massive Music Archive Is Online: Features 20,000 Historic Blues & Folk Recordings

19 Mar 2026

openculture.com

A huge treasure trove of songs and interviews recorded by the legendary folklorist Alan Lomax from the 1940s into the 1990s has been digitized and made available online for free listening.

Electronic Texts of H.P. Lovecraft's Works

2 Mar 2026

hplovecraft.com

Paper page - A Very Big Video Reasoning Suite

25 Feb 2026

huggingface.co

Join the discussion on this paper page

Who are the Top 100 Landowners in the US? | Land Report 100

9 Feb 2026

landreport.com

Land Report 100: Explore the 2025 Top 100 U.S. Landowners. Who is America's Largest Landowner? This question is the quest

How to Design Production-Grade Mock Data Pipelines Using Polyfactory with Dataclasses, Pydantic, Attrs, and Nested Models - MarkTechPost

8 Feb 2026

marktechpost.com

Stream 4,000+ Public Domain Movies on WikiFlix: Silent Classics, Academy Award-Winners, Hitchcock Films & More

13 Jan 2026

openculture.com

Humanity was already enjoying motion pictures a century ago. But the ability to do so at home still lay a few decades in the future, and the ability to pull up a movie on demand through a streaming service much further still.

With Spotify’s Library Plundered, the Door Is Open for Music Preservation, but Also for AI Companies

22 Dec 2025

vice.com

Spotify's library was scraped in the name of music preservation, but will this make illegally training AI even easier?

Backing up Spotify

21 Dec 2025

annas-archive.li

We backed up Spotify (metadata and music files). It’s distributed in bulk torrents (~300TB). It’s the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens.

Getting Started with OpenBB: The Ultimate Python Toolkit for Financial and Economic Data

20 Dec 2025

open.substack.com

Streamline your financial data workflow with OpenBB. Learn setup, data extraction, and automation using Python—perfect for analysts and economists.

https://www.datacentermap.com/

15 Dec 2025

datacentermap.com

GlobalBuildingAtlas LoD1

12 Dec 2025

tubvsig-so2sat-vm1.srv.mwn.de

College Sports Finances Database

25 Nov 2025

sportico.com

Sportico is maintaining an interactive, real-time database that tracks the official balance sheets of public FBS university athletic departments.

User:Birdman86 - Wikimedia Commons

22 Oct 2025

commons.wikimedia.org

Browse Catalog | LibriVox

22 Oct 2025

librivox.org

LibriVox

65 Essential Children’s Books

15 Oct 2025

theatlantic.com

Illustrated titles that teach kids to love literature

Advanced Datasets for AI & ML Projects

31 Aug 2025

amanxai.com

In this article, I'll take you through a list of 20 advanced datasets you should try to build your next AI & ML projects.

Bridging the Gap: New Datasets Push Recommender Research Toward Real-World Scale - KDnuggets

13 Jul 2025

kdnuggets.com

Publicly available datasets in recommender research currently shaping the field.

International Postal & Zip Code Database

5 May 2025

geopostcodes.com

GeoPostcodes provides the world’s most comprehensive postal/zip code database. Complete, accurate, always up-to-date and enterprise-ready.

Statology Sprint: Advanced Synthetic Data Generation with Faker

18 Apr 2025

statology.org

This Statology Sprint brings together our most valuable content on Faker, Python's powerful synthetic data generation library, to help you create realistic, privacy-compliant test data for your projects.

I Analyzed Chord Progressions in 680k Songs

18 Apr 2025

cantgetmuchhigher.com

And the results surprised me

OpenTimes

18 Mar 2025

simonwillison.net

Spectacular new open geospatial project by Dan Snow (https://sno.ws/): "OpenTimes is a database of pre-computed, point-to-point travel times between United States Census geographies. It lets you download bulk travel time …"

Visualizing all the books in the world

26 Feb 2025

flowingdata.com

To show a catalog of almost 100 million books in one view, phiresky mapped them based on International Standard Book Numbers, or ISBNs, with an interactive visualization.

US Counties Database | Simplemaps.com

29 Dec 2024

simplemaps.com

Free and commercial U.S. counties databases. Includes latitude, longitude, population, largest city, zip codes, timezone, income and more. CSV, SQL and Excel format.

These stunning images trace ships’ routes as they move

25 Dec 2024

technologyreview.com

Publicly available data helps monitor ship traffic to avoid disruption of undersea internet cables, identify whale strikes, and study the footprint of underwater noise.

Harvard and Google to release 1 million public-domain books as AI training dataset | TechCrunch

12 Dec 2024

techcrunch.com

AI training data has a big price tag, one best-suited for deep-pocketed tech firms. This is why Harvard University plans to release a dataset that

131M American Buildings

6 Nov 2024

tech.marksblogg.com

Benchmarks & Tips for Big Data, Hadoop, AWS, Google Cloud, PostgreSQL, Spark, Python & More...

Free: Download Over 33000 Sounds from the BBC Sound Effects Archive

19 Oct 2024

openculture.com

There may be a few young people in Britain today who recognize the name Ludwig Koch, but in the nineteen-forties, he constituted something of a cultural phenomenon unto himself.

List of U.S. stadiums by capacity

2 Aug 2024

en.m.wikipedia.org

The following is a list of stadiums in the United States. They are ranked by capacity, which is the maximum number of spectators the stadium can normally accommodate. All U.S. stadiums with a current capacity of 10,000 or more are included in the list. The majority of these stadiums are used for American football, either in college football or the NFL. Most of the others are Major League Baseball ballparks or Major League Soccer stadiums. Rows shaded in yellow indicate that the stadium is home to an NFL, MLB, MLS, or NWSL franchise.

A Beginner’s Guide to Identifying Explosive Ordnance in Social Media Imagery

1 Aug 2024

bellingcat.com

Learn to identify some of the more common types of unexploded ordnance (UXO) in online images using open-source tools and resources.

The pile dataset has become Big Tech’s secret spice

31 Jul 2024

dataconomy.com

The pile dataset has become a hot topic in AI circles, sparking debates about how data is used and the

Little-Known Tool Is Giving Instant Access To Vast Amounts of Homebuyer Data

28 Jul 2024

yro.slashdot.org

Comprehensive Guide to Datasets and Dataloaders in PyTorch

22 Jun 2024

towardsdatascience.com

The full guide to creating custom datasets and dataloaders for different models in PyTorch

DMA® Regions | Nielsen

22 Jun 2024

nielsen.com

DMA (Designated Market Area) regions are the geographic areas and zip codes in the U.S. in which local television viewing is measured by Nielsen.

Carabiner Collection

20 Jun 2024

carabinercollection.com

Plastic Properties Table

12 Jun 2024

curbellplastics.com

Use our plastics properties table to sort and compare plastic materials. Review typical, physical, thermal, optical, electrical properties. Ask an Expert or Get a Quote.

The solar industrial revolution is the biggest investment opportunity in hi

12 Jun 2024

caseyhandmer.wordpress.com

Solar is in the process of shearing off the base of the entire global industrial stack – energy – and the tech sector still lacks a unified thesis for how to best enable, accelerate, an…

Download Issues of “Weird Tales” (1923–1954): The Pioneering Pulp Horror Ma

12 Jun 2024

openculture.com

We live in an era of genre. Browse through TV shows of the last decade to see what I mean: Horror, sci-fi, fantasy, superheroes, futuristic dystopias…. Take a casual glance at the burgeoning global film franchises or merchandising empires.

What is Dataset Distillation Learning? A Comprehensive Overview

11 Jun 2024

marktechpost.com

Dataset distillation is an innovative approach that addresses the challenges posed by the ever-growing size of datasets in machine learning. This technique focuses on creating a compact, synthetic dataset that encapsulates the essential information of a larger dataset, enabling efficient and effective model training. Despite its promise, the intricacies of how distilled data retains its utility and information content have yet to be fully understood. Let’s delve into the fundamental aspects of dataset distillation, exploring its mechanisms, advantages, and limitations. Dataset distillation aims to overcome the limitations of large datasets by generating a smaller, information-dense dataset. Traditional data compression methods

18 Data Profiling Tools Every Developer Must Know

5 Jun 2024

marktechpost.com

Analytics, management, and business intelligence (BI) procedures, such as data cleansing, transformation, and decision-making, rely on data profiling. Content and quality reviews are becoming more important as data sets grow in size and variety of sources. In addition, organizations that rely on data must prioritize data quality review. Analysts and developers can enhance business operations by analyzing the dataset and drawing significant insights from it. Data profiling is a crucial tool for evaluating data quality. It entails analyzing, cleansing, transforming, and modeling data to find valuable information, improve data quality, and assist in better decision-making. What is Data Profiling? Examining

COCONut: A High-Quality, Large-Scale Dataset for Next-Gen Segmentation Mode

23 Apr 2024

marktechpost.com

Computer vision has advanced significantly in recent decades, thanks in large part to comprehensive benchmark datasets like COCO. However, nearly a decade after its introduction, COCO's suitability as a benchmark for modern AI models is being questioned. Its annotations may contain biases and nuances reflecting the early stages of computer vision research. With model performance plateauing on COCO, there are concerns about overfitting to the dataset's specific characteristics, potentially limiting real-world applicability. To modernize COCO segmentation, researchers have proposed COCONut - a novel, large-scale universal segmentation dataset in this paper. Unlike previous attempts at creating large datasets that often compromised

Home - OpenCorporates

13 Apr 2024

opencorporates.com

Spotify API: How To Create a Data Set of Songs

19 Mar 2024

dev.to

A Fun Tutorial using Python, JSON, and Spotify API! You might find it more comfortable...

3D Images of Over 13,000 Museum Specimens Now Free To Everyone

12 Mar 2024

science.slashdot.org

City-Data.com - Stats about all US cities - real estate, relocation info, c

17 Feb 2024

city-data.com

Stats about all US cities - real estate, relocation info, crime, house prices, schools, races, income, photos, sex offenders, maps, education, weather, home value estimator, recent sales, etc.

How to detect poisoned data in machine learning datasets

15 Feb 2024

venturebeat.com

A proactive, coordinated effort can reduce the chances that manipulations will impact model performance and protect algorithmic integrity.

Beautiful Free Images & Pictures | Unsplash

10 Feb 2024

unsplash.com

Beautiful, free images and photos that you can download and use for any project. Better than any royalty free or stock photos.

Welcome to Open Library | Open Library

10 Feb 2024

openlibrary.org

Open Library is an open, editable library catalog, building towards a web page for every book ever published. Read, borrow, and discover more than 3M books for free.

Why is the Pile a good benchmark?

4 Feb 2024

pile.eleuther.ai

The Pile is an 825 GiB diverse, open source language modelling data set that consists of 22 smaller, high-quality datasets combined together.

Sports Data Stuff | CFB PBP

23 Jan 2024

sportsdatastuff.com

College Football Play-By-Play Data

‘Let’s Go Shopping (LGS)’ Dataset: A Large-Scale Public Dataset with 15M Im

17 Jan 2024

marktechpost.com

Developing large-scale datasets has been critical in computer vision and natural language processing. These datasets, rich in visual and textual information, are fundamental to developing algorithms capable of understanding and interpreting images. They serve as the backbone for enhancing machine learning models, particularly those tasked with deciphering the complex interplay between visual elements in images and their corresponding textual descriptions. A significant challenge in this field is the need for large-scale, accurately annotated datasets. These are essential for training models but are often not publicly accessible, limiting the scope of research and development. The ImageNet and OpenImages datasets, containing human-annotated

Analysis of 1,800 AI datasets: ~70% didn't state what license should be use

28 Oct 2023

techmeme.com

Nitasha Tiku / Washington Post: Analysis of 1,800 AI datasets: ~70% didn't state what license should be used or had been mislabeled with more permissive guidelines than their creators intended

Meet SwimXYZ: A Synthetic Dataset of Swimming Motions and Videos Containing

20 Oct 2023

marktechpost.com

Human motion capture has emerged as a key tool in various industries, including sports, medical, and character animation for the entertainment sector. Motion capture is utilized in sports for multiple purposes, including injury prevention, injury analysis, video game industry animations, and even generating informative visualization for TV broadcasters. Traditional motion capture systems provide solid results in the majority of circumstances. Still, they are expensive and time-consuming to set up, calibrate, and post-process, making them difficult to utilize on a broad scale. These concerns are made worse for aquatic activities like swimming, which bring up unique problems such as marker reflections

Download 222 Belle Époque Art Posters: An Online Archive of Masterpieces fr

20 Oct 2023

openculture.com

Europe at the end of the nineteenth century and beginning of the twentieth: what a time and place to be alive.

Library of Short Stories

25 Sep 2023

libraryofshortstories.com

Read For Free, Anywhere, Anytime. An online library of over 1000 classic short stories. H. G. Wells, Edgar Allan Poe, H. P. Lovecraft, Anton Chekhov, Beatrix Potter.

Dirty Secrets of BookCorpus, a Key Dataset in Machine Learning

21 Sep 2023

towardsdatascience.com

BookCorpus has helped train at least thirty influential language models (including Google’s BERT, OpenAI’s GPT, and Amazon’s Bort), according to HuggingFace. This is the research question that…

MIT Researchers Created a New Annotated Synthetic Dataset of Images that De

17 Sep 2023

marktechpost.com

Large-scale pre-trained Vision and language models have demonstrated remarkable performance in numerous applications, allowing for the replacement of a fixed set of supported classes with zero-shot open vocabulary reasoning over (nearly arbitrary) natural language queries. However, recent research has revealed a fundamental flaw in these models. For instance, their inability to comprehend Visual Language Concepts (VLC) that extend 'beyond nouns,' such as the meaning of non-object words (e.g., attributes, actions, relations, states, etc.), or their difficulty with compositional reasoning, such as comprehending the significance of the word order in a sentence. Vision and language models, powerful machine-learning algorithms that learn

the-markup/xandr-audience-segments

30 Jul 2023

github.com

Inside the secret list of websites that make AI like ChatGPT sound smart

19 Apr 2023

washingtonpost.com

An analysis of a chatbot data set by The Washington Post reveals the proprietary, personal, and often offensive websites that go into an AI’s training data.

How to Anonymise Places in Python

18 Dec 2022

towardsdatascience.com

Ready-to-run code that identifies and anonymises places, based on the GeoNames database

11 Less Used but Important Plots for Data Science

23 Nov 2022

towardsdatascience.com

Some Unique Data Visualization Techniques for Getting High-Level Insight into the Data

The Data Cards Playbook: A Toolkit for Transparency in Dataset Documentatio

21 Nov 2022

ai.googleblog.com

Posted by Mahima Pushkarna, Senior Interaction Designer, and Andrew Zaldivar, Senior Developer Relations Engineer, Google Research. As machine learn...

NCAA Statistics

20 Nov 2022

stats.ncaa.org

Create geo image dataset in 20 minutes

14 Oct 2022

towardsdatascience.com

Build geo specific subset of LAION-5B

Huge new dataset pushes limits of neuroscience

4 Oct 2022

wired.com

The Allen Institute’s release includes recordings from a whopping 300,000 mouse neurons. Now the challenge is figuring out what to do with all that data.

National Rail Network Map

14 Sep 2022

arcgis.com

Home page | College Athletics Database

10 Sep 2022

knightnewhousedata.org

Introduction to The World of Data - (OLTP, OLAP, Data Warehouses, Data Lake

17 Aug 2022

cloudnatively.com

https://codingvc.com/the-value-of-data-part-1-using-data-as-a-competitive-advantage

18 Jul 2022

codingvc.com

10 Data Acquisition Strategies for Startups

11 Jul 2022

medium.com

The “unreasonable effectiveness” of data for machine-learning applications has been widely debated over the years (see here, here and…

The ArtBench Dataset: Benchmarking Generative Models with Artworks

6 Jul 2022

substack.com

We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation. It comprises 60,000 images of artwork from 10...

https://codingvc.com/the-value-of-data-part-2-building-valuable-datasets

5 Jul 2022

codingvc.com

Mapping Urban Trees Across North America with the Auto Arborist Dataset

5 Jul 2022

substack.com

Posted by Sara Beery, Student Researcher, and Jonathan Huang, Research Scientist, Google Research, Perception Team. Over four billion people live in...

https://codingvc.com/the-value-of-data-part-3-data-business-models/

28 Jun 2022

codingvc.com

AllMusic: The Story of the Big Data Jukebox

25 Jun 2022

tedium.co

When AllMusic launched 25 years ago, it wasn't an obvious big data play. But it became one. Hidden in its millions of entries is music's collective history.

The market for synthetic data is bigger than you think

1 Jun 2022

techcrunch.com

To understand what's happening, but also what's coming if synthetic data does get more broadly adopted, we talked to various CEOs and VCs over the last few months.

93 Datasets That Load With A Single Line of Code

13 May 2022

towardsdatascience.com

How you can pull one of a few dozen example political, sporting, education, and other frames on-the-fly.

Zip Code Database Lookup | Everything By Zip Code

3 May 2022

everythingbyzipcode.com

Our zip code database is a unified view of public datasets like the Census, American Community Survey, Bureau of Labor Statistics and the CDC, spanning 800+ data points, also offering a free zip code database.

Indus

11 Feb 2022

user.tu-berlin.de

How to Create Fake Data with Faker

17 Jan 2022

towardsdatascience.com

You can either Collect Data or Create your Own Data

Curating a Dataset from Raw Images and Videos

16 Jan 2022

link.medium.com

Best-practices to follow when building datasets from large pools of image and video data and tools that make it straightforward.

MedMNIST v2 Dataset | Papers With Code

29 Oct 2021

paperswithcode.com

MedMNIST v2 is a large-scale MNIST-like collection of standardized biomedical images, including 12 datasets for 2D and 6 datasets for 3D. All images are pre-processed into 28 x 28 (2D) or 28 x 28 x 28 (3D) with the corresponding classification labels, so that no background knowledge is required for users. Covering primary data modalities in biomedical images, MedMNIST v2 is designed to perform classification on lightweight 2D and 3D images with various data scales (from 100 to 100,000) and diverse tasks (binary/multi-class, ordinal regression and multi-label). The resulting dataset, consisting of 708,069 2D images and 10,214 3D images in total, could support numerous research / educational purposes in biomedical image analysis, computer vision and machine learning. Description and image from: MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification. Each subset keeps the same license as that of the source dataset. Please also cite the corresponding paper of source data if you use any subset of MedMNIST.

LightOn Released FC-AMF-OCR Dataset: A 9.3 Million Images Dataset of Financial Documents with Full O

24 Sep 2021

marktechpost.com

The release of the FC-AMF-OCR Dataset by LightOn marks a significant milestone in optical character recognition (OCR) and machine learning. This dataset is a technical achievement and a cornerstone for future research in artificial intelligence (AI) and computer vision. Introducing such a dataset opens up new possibilities for researchers and developers, allowing them to improve OCR models, which are essential in converting images of text into machine-readable text formats. Background of LightOn and FC-AMF-OCR Dataset LightOn, a company recognized for its pioneering contributions to AI and machine learning, has continuously pushed the boundaries of technology. The FC-AMF-OCR Dataset is one

Laion-400M: open-source dataset of 400M image-text pairs

14 Sep 2021

laion.ai

DatasetGAN

24 May 2021

nv-tlabs.github.io

Awesome list of datasets in 100 categories

20 May 2021

kdnuggets.com

With an estimated 44 zettabytes of data in existence in our digital world today and approximately 2.5 quintillion bytes of new data generated daily, there is a lot of data out there you could tap into for your data science projects. It's pretty hard to curate through such a massive…

Web Scraping to Create a Dataset using Python

18 May 2021

thecleverprogrammer.com

In this article, I'm going to walk you through a tutorial on web scraping to create a dataset using Python and BeautifulSoup.

Datasets should behave like git repositories | by Simon Lousky | Towards Da

5 May 2021

towardsdatascience.com

Create, maintain, and contribute to a long-living dataset that will update itself automatically across projects.

UCI Machine Learning Repository

3 Apr 2021

archive.ics.uci.edu

Discover datasets around the world!

www-eio.upc.edu/~pau/cms/rdata/datasets.html

23 Feb 2021

www-eio.upc.edu

UMLS Metathesaurus Browser

14 Jan 2021

uts.nlm.nih.gov

This is an interface for searching and browsing the UMLS Metathesaurus data. Our goal here is to present the UMLS Metathesaurus data in a useful way.

The Child Affective Facial Expression (CAFE) set: validity and reliability

18 Dec 2020

ncbi.nlm.nih.gov

Emotional development is one of the largest and most productive areas of psychological research. For decades, researchers have been fascinated by how humans respond to, detect, and interpret emotional facial expressions. Much of the research in this area ...

Explore Census Data

30 Nov 2020

data.census.gov

judicial search

29 Nov 2020

judyrecords.com

Instantly search 740 million+ United States court cases.

Why Data Standards Matter

3 Nov 2020

safegraph.com

The power of join keys and how data standards can make data more valuable and accelerate collaboration and innovation. This is the second installment of the DaaS Bible series.

The history of autonomous vehicle datasets and 3 open-source Python apps fo

16 Oct 2020

r-bloggers.com

Special thanks to Plotly investor, NVIDIA, for their help in reviewing these open-source Dash applications for autonomous vehicle R&D, and Lyft for initial data visualization development in Plotly. Author: Xing Han Lu, @xhlulu (originally posted on Medium). To learn more about how to use Dash for Autonomous Vehicle and AI Applications register for our live webinar with […]

Download GNIS Data | U.S. Geological Survey

10 Aug 2020

usgs.gov

Learn about and download U.S. Board on Geographic Names data from the Geographic Names Information System (GNIS)

Automated Data Import with Python

1 Jun 2020

towardsdatascience.com

A different approach to import data files automatically in python.

Generating Synthetic Patient Data

1 Jun 2020

towardsdatascience.com

A quick look at using Synthea

Time Series Analysis: Creating Synthetic Datasets

17 May 2020

towardsdatascience.com

How to create time series datasets with different patterns

Millions of tiny databases

9 Mar 2020

blog.acolyer.org

The Big Bad NLP Database: Access Nearly 300 Datasets

9 Mar 2020

kdnuggets.com

Check out this database of nearly 300 freely-accessible NLP datasets, curated from around the internet.

Slashdot

26 Feb 2020

tech.slashdot.org

A Directory of High Quality, Real-Time Event Sources

19 Feb 2020

github.com

Connect APIs, remarkably fast. Free for developers. - PipedreamHQ/pipedream

5 Data Cleansing Tools - DataScienceCentral.com

19 Feb 2020

datasciencecentral.com

You need to analyze data to make more informed decisions. There are many tools to help you analyze the data visually or statistically, but they only work if the data is already clean and consistent. Here is a list of 5 data cleansing tools. Drake: Drake is a simple-to-use, extensible, text-based data workflow tool that…

Rahul Agarwal on LinkedIn: #datavisualization #awesomevisualization #seaborn #python

19 Feb 2020

linkedin.com

"Enter into picture Swarmplots, just like their name." https://lttr.ai/MJtZ #datavisualization #awesomevisualization #seaborn #python

The 5 most useful Techniques to Handle Imbalanced datasets

19 Feb 2020

mlwhiz.com

This post is about explaining the various techniques you can use to handle imbalanced datasets

Data USA

19 Feb 2020

datausa.io

The most comprehensive visualization of U.S. public data. Data USA provides an open, easy-to-use platform that turns data into knowledge.

14 Paris Museums Put 300,000 Works of Art Online: Download Classics by Mone

10 Jan 2020

openculture.com

First trips to Paris all run the same risk: that of the museums consuming all of one's time in the city. What those new to Paris need is a museum-going strategy, not that one size will fit all.

vumaasha/Atlas: Atlas: A Dataset and Benchmark for E-commerce Clothing Product Categorization

23 Dec 2019

github.com

Atlas: A Dataset and Benchmark for E-commerce Clothing Product Categorization - vumaasha/Atlas

This object-recognition dataset stumped the world’s best computer vision mo

11 Dec 2019

news.mit.edu

When computer vision detectors are turned loose in the real world, their performance noticeably drops. In an effort to close this performance gap, a team of MIT and IBM researchers set out to create a very different kind of object-recognition dataset called ObjectNet.

Welcome! | Million Song Dataset

30 Oct 2019

millionsongdataset.com