
A selection of datasets with working data processing examples

Hello, reader.



Following up on my first post, a selection of datasets for machine learning, here is a selection of relatively recent datasets with working data processing examples. After all, it is no secret that learning from good examples is faster and more effective. Let's look at some of the best data processing examples available.



This post works the same way as my post about the best notebooks for ML and DS: save it to your bookmarks, then pass it on to a colleague.

Plus a bonus at the end of the article: an excellent course from FPMI MIPT.






So let's get started.



Selection of datasets with working examples of data processing:



Suicide Rates Overview 1985 to 2016 - compares socio-economic information with suicide rates by year and country.
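
As a minimal sketch of the kind of processing this dataset invites, the Python snippet below aggregates suicide counts into a rate per 100,000 people by country and year. The rows are toy values and the column names (`country`, `year`, `suicides_no`, `population`) are assumptions for illustration, not taken from the actual file or from the linked notebooks.

```python
import pandas as pd

# Toy rows standing in for the real file; the values and the
# column names are assumptions made for this illustration only.
rows = [
    {"country": "A", "year": 2000, "suicides_no": 10, "population": 100_000},
    {"country": "A", "year": 2000, "suicides_no": 30, "population": 300_000},
    {"country": "A", "year": 2001, "suicides_no": 20, "population": 400_000},
    {"country": "B", "year": 2000, "suicides_no": 5, "population": 500_000},
]
df = pd.DataFrame(rows)


def rate_per_100k(frame):
    """Sum counts and population per (country, year), then derive a rate."""
    totals = frame.groupby(["country", "year"], as_index=False)[
        ["suicides_no", "population"]
    ].sum()
    totals["rate_per_100k"] = (
        totals["suicides_no"] / totals["population"] * 100_000
    )
    return totals


result = rate_per_100k(df)
print(result)
```

Summing counts and population before dividing avoids averaging rates across subgroups of different sizes, which would weight small groups too heavily.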



Processing examples:





Spotify's Worldwide Daily Song Ranking - the daily ranking of the 200 most listened-to songs by Spotify users in 53 countries, covering 2017 and 2018.



Processing Example:





Crimes in Boston - records from the Boston crime incident reporting system, including information about when and where each incident occurred.



Processing Example:





Google Play Store Apps - categories, ratings, and sizes of Google Play apps.



Processing Example:





Pokémon for Data Mining and Machine Learning - statistics and features of Pokémon.



Processing Example:





A Million News Headlines - news headlines published over the past 15 years.



Processing Example:





Airplane Crashes Since 1908 - the full history of air crashes around the world, from 1908 to the present.



Processing Example:





News Headlines Dataset For Sarcasm Detection - a high-quality dataset for the sarcasm detection task.



Processing Example:





Historical Air Quality - air quality data collected by outdoor monitors throughout the United States.



Processing Example:





Nutrition Facts for McDonald's Menu - nutritional analysis of every item on the McDonald's US menu.



Processing Example:





LEGO Database - parts, sets, colors, and inventories of every official LEGO set in the Rebrickable database.



Processing Example:





Global Commodity Trade Statistics - import and export volumes for 5,000 products in most countries of the world over the past 30 years.



Processing Example:





Crime in India - complete information on various aspects of crimes committed in India since 2001.



Processing Example:





Predicting a Pulsar Star - data on pulsars collected during a survey of the Universe.



Processing examples:





French employment, salaries, population per town - data showing equality and inequality in France.



Processing Example:





United States Census - US census data.



Processing Example:





California Housing Prices - housing prices in California.



Processing Example:





US Unemployment Rate by County, 1990-2016 - United States Department of Labor unemployment data.



Processing Example:





World of Warcraft Avatar History - a set of records detailing information about players' characters in the game over time.



Processing Example:





The Gravitational Waves Discovery Data - data on the gravitational wave event GW150914.



Processing Example:





Bonus!



And as a bonus, today we have an excellent Deep Learning course designed for high school students who are interested in programming and mathematics, as well as for students who want to get started with deep learning.



The purpose of the course is to introduce the basic principles of deep learning (neural networks) in an interactive format and through practical tasks.



Course program



  1. Python: Basics, Google Colab;
  2. Introduction to linear algebra. Vectors, matrices, and operations on them. NumPy library;
  3. Pandas and Matplotlib libraries. Basics of machine learning;
  4. Elements of optimization theory. Gradient. Gradient descent. Linear models;
  5. Introduction to deep learning. Perceptron. A neuron with a sigmoid (and other activation functions). Basics of OOP in Python;
  6. PyTorch library. Multilayer neural networks;
  7. Training neural networks in practice. CIFAR-10, notMNIST;
  8. Convolutional neural networks. Convolutional layer. Pooling layer;
  9. Training neural networks in practice. Road sign classification;
  10. Transfer learning. Popular computer vision architectures;
  11. Image segmentation. U-Net;
  12. Participation in Kaggle competitions;
  13. Object Detection. YOLOv3;
  14. Classic GAN. Neural style transfer;
  15. Basic text processing techniques;
  16. Word Embeddings;
  17. Recurrent neural networks;
  18. LSTM, GRU cells;
  19. Language models;
  20. Machine translation;
  21. Text2Speech;
  22. SuperResolution.
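
To give a taste of item 4 of the program (gradient, gradient descent, linear models), here is a minimal sketch in Python with NumPy. It is not taken from the course materials: the function name `fit_linear_gd`, the learning rate, the step count, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def fit_linear_gd(x, y, lr=0.1, steps=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error.

    Hypothetical helper for illustration; lr and steps are arbitrary
    choices that happen to converge on the toy data below.
    """
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        err = w * x + b - y                # prediction error per sample
        grad_w = 2.0 / n * np.dot(err, x)  # d(MSE)/dw
        grad_b = 2.0 / n * err.sum()       # d(MSE)/db
        w -= lr * grad_w                   # step against the gradient
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should recover w = 2, b = 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
w, b = fit_linear_gd(x, y)
print(w, b)
```

The same loop is what a deep learning framework automates: compute the loss gradient with respect to each parameter, then move each parameter a small step in the opposite direction.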


You can also take a look at the Deep Learning School's YouTube channel. There are many great videos ;)



This concludes our short selection of data processing examples. I hope you learned something new. As is customary on Habr: if you liked the post, give it a plus. Do not forget to share it with colleagues. Also, if you have something of your own to share, write in the comments. More information about machine learning and Data Science is available on Habr and in the Neuron Telegram channel (@neurondata).



Knowledge to all!

Source: https://habr.com/ru/post/460557/


