Machine Learning Data
Up to 80% of a Machine Learning project involves collecting and preparing data:
- What data is needed?
- What data is available?
- How do we select the data?
- How do we collect the data?
- How do we clean the data?
- How do we prepare the data?
- How do we use the data?
What is data?
Data can be many things. In Machine Learning, data is a collection of facts:
| Type | Examples |
|---|---|
| Numbers | Prices. Dates. |
| Measurements | Size. Height. Weight. |
| Words | Names and places. |
| Observations | Counting cars. |
| Descriptions | It's cold. |
Intelligence needs data
Human intelligence needs data: A real estate agent needs data on homes that have been sold to estimate prices.
Artificial intelligence also needs data: A machine learning program needs data to estimate prices.
- Data can help us see and understand.
- Data can help us identify new opportunities.
- Data can help us resolve misunderstandings.
Healthcare
The healthcare and life sciences industries collect public health data and patient data to learn how to improve patient care and save lives.
Business
The most successful companies in many fields are data-driven. They use sophisticated data analytics to understand how the company can perform better.
Finance
Banks and insurance companies collect and evaluate data on customers, loans, and deposits to support strategic decision-making.
Data storage
The most commonly collected data are numbers and measurements. This data is typically stored in arrays that represent the relationship between the values.
This table shows house prices against house sizes:
| Size | Price |
|---|---|
| 50 | 7 |
| 60 | 8 |
| 70 | 8 |
| 80 | 9 |
| 90 | 9 |
| 100 | 9 |
| 110 | 10 |
| 120 | 11 |
| 130 | 14 |
| 140 | 14 |
| 150 | 15 |
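As a minimal sketch of what "stored in arrays" can look like, the house data above could be held in two parallel Python lists. The variable names and the price-per-size calculation are illustrative assumptions, not part of the original article, and the source does not state units for either column.

```python
# Values copied from the house table above; units are not given in the source.
prices = [7, 8, 8, 9, 9, 9, 10, 11, 14, 14, 15]
sizes = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150]

# The two arrays are parallel: position i in each list describes the same house.
assert len(prices) == len(sizes)

# Pair each size with its price so the relationship between values is explicit.
houses = list(zip(sizes, prices))

# Illustrative derived value: price per unit of size for each house.
price_per_size = [price / size for size, price in houses]

print(houses[0])
print(len(houses))
```

Keeping the lists parallel is the simplest layout; pairing them with `zip` makes it harder to accidentally compare a price with the wrong size.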
Quantitative data versus qualitative data
Quantitative data is numerical data:
- 55 cars
- 15 meters
- 35 children
Qualitative data is descriptive data:
- It's cold.
- It's long.
- That's fun!
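One practical consequence of this distinction is how each kind of data gets summarized. The sketch below is an assumption-laden illustration: the readings and comments are made up, and the choice of an average for numbers and a tally for descriptions is just one common approach.

```python
from collections import Counter
from statistics import mean

# Hypothetical readings, invented for illustration only.
temperatures = [2, 4, 3, 5]                  # quantitative: numbers
comments = ["cold", "cold", "mild", "cold"]  # qualitative: descriptions

# Quantitative data supports arithmetic, such as an average.
average_temperature = mean(temperatures)

# Qualitative data is usually summarized by counting categories.
comment_counts = Counter(comments)

print(average_temperature)
print(comment_counts.most_common(1))
```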
Census or sampling
A census is when we collect data for every member of a group.
Sampling is when we collect data for only some members of a group.
If we want to know how many Americans smoke, we can ask everyone in America (a census), or we can ask 10,000 people (a sample).
A census is accurate but difficult to carry out. A sample is less accurate but easier to carry out.
Sampling terminology
A population is the group of individuals (subjects) we want to collect information from.
A census is information about every individual in the population.
A sample is information about part of the population, chosen to represent the whole.
Random sample
For a sample to be representative of the population, it must be collected randomly.
A random sample is a sample in which every member of the population has an equal chance of appearing in the sample.
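The idea can be sketched with Python's standard `random` module. Everything about the population here is hypothetical: the size of 100,000 and the 20% rate are invented for illustration and are not real statistics. `Random.sample` draws without replacement, so each member has the same chance of being picked.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: True means the person smokes.
# The 20% rate is made up for this sketch.
population = [rng.random() < 0.20 for _ in range(100_000)]

# Asking everyone gives the exact share for this population.
full_rate = sum(population) / len(population)

# Simple random sample: every member is equally likely to be selected.
sample = rng.sample(population, 10_000)
sample_rate = sum(sample) / len(sample)

print(f"everyone: {full_rate:.3f}")
print(f"sample:   {sample_rate:.3f}")
```

The two printed rates should be close but not identical, which is the trade-off between asking everyone and asking a random subset.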
Sampling bias
Sampling bias occurs when a sample is collected in a way that makes some individuals less (or more) likely to be included than others.
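A small simulation can make the effect visible. This is a sketch under stated assumptions: the two age groups, their rates, and the helper names `make_person` and `smoking_rate` are all hypothetical. The biased sample only draws from one group, so members of the other group have no chance of being included.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_person():
    """Return a hypothetical (age_group, smokes) pair; rates are invented."""
    age_group = rng.choice(["young", "old"])
    rate = 0.10 if age_group == "young" else 0.30
    return age_group, rng.random() < rate

def smoking_rate(people):
    """Share of people in the list who smoke."""
    return sum(smokes for _, smokes in people) / len(people)

population = [make_person() for _ in range(50_000)]

# Random sample: every person is equally likely to be chosen.
random_sample = rng.sample(population, 5_000)

# Biased sample: only "young" people can be chosen.
young_only = [person for person in population if person[0] == "young"]
biased_sample = rng.sample(young_only, 5_000)

print(f"population:    {smoking_rate(population):.3f}")
print(f"random sample: {smoking_rate(random_sample):.3f}")
print(f"biased sample: {smoking_rate(biased_sample):.3f}")
```

The random sample should track the population rate, while the biased sample understates it because the group with the higher rate was never eligible.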
Big data
Big data is data that humans cannot process without the assistance of advanced machines.
Big data has no fixed size threshold, but datasets keep growing as we collect more data and store it at ever lower cost.
Data mining
Big data comes with complex data structures.
A large part of big data processing is refining the data.