Machine Learning Data

Up to 80% of the work in a Machine Learning project involves collecting and preparing data:

  • What data is needed?
  • What data is available?
  • How do we select the data?
  • How do we collect the data?
  • How do we clean the data?
  • How do we prepare the data?
  • How do we use the data?

What is data?

Data can be many things. In Machine Learning, data is a collection of events.

Intelligence needs data.

Human intelligence needs data: A real estate agent needs data on homes that have been sold to estimate prices.

Artificial intelligence also needs data: A machine learning program needs data to estimate prices.

 

  • Data can help us see and understand.
  • Data can help us identify new opportunities.
  • Data can help us resolve misunderstandings.

Healthcare

The healthcare and life sciences industries collect public health data and patient data to learn how to improve patient care and save lives.

Business

The most successful companies in many fields are data-driven. They use sophisticated data analytics to understand how the company can perform better.

Finance

Banks and insurance companies collect and evaluate data on customers, loans, and deposits to support strategic decision-making.

Data storage

The most commonly collected data are numbers and measurements. This data is often stored in arrays that represent the relationships between the values.

This table shows house prices compared to area:
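
As a minimal sketch of how such a table could be held in arrays, the example below pairs each area with a price. The values and the names `areas_m2`, `prices`, and `houses` are made up for illustration and are not the figures from the original table.

```python
# Hypothetical values, for illustration only (not the original table's figures).
areas_m2 = [50, 60, 70, 80, 90]   # house area in square meters
prices = [7, 8, 8, 9, 10]         # house price, in arbitrary units

# Pair each area with its price so the relationship between the values stays explicit.
houses = list(zip(areas_m2, prices))

for area, price in houses:
    print(f"{area} m2 -> price {price}")

# Derived column: price per square meter for each house.
price_per_m2 = [price / area for area, price in houses]
```

Keeping the two arrays aligned by position is the simplest option; a list of pairs, as in `houses`, makes it harder to mix up which price belongs to which area.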

 

Quantitative data versus qualitative data

Quantitative data is numerical data:

  • 55 cars
  • 15 meters
  • 35 children

Qualitative data is descriptive data:

  • It's cold.
  • It's long.
  • That's fun!
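
In code, the distinction often shows up as the type of each value. The sketch below separates numerical from descriptive values; the `observations` dictionary and the `is_quantitative` helper are hypothetical names chosen for this example.

```python
# Hypothetical observations, for illustration only.
observations = {
    "cars": 55,
    "length_m": 15,
    "children": 35,
    "weather": "cold",
    "mood": "fun",
}


def is_quantitative(value):
    """Return True for numeric values (bool is excluded, since it subclasses int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


quantitative = {k: v for k, v in observations.items() if is_quantitative(v)}
qualitative = {k: v for k, v in observations.items() if not is_quantitative(v)}

print("quantitative:", quantitative)
print("qualitative:", qualitative)
```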

Census or sampling

A census is when we collect data for every member of a group.

Sampling is when we collect data for some members of a group.

If we want to know how many Americans smoke, we can ask everyone in America (census), or we can ask 10,000 people (sampling).

A census is accurate but difficult to carry out. Sampling is less accurate but easier to carry out.
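
The trade-off between asking everyone and asking only 10,000 people can be sketched with a synthetic population. Everything here is an assumption made for illustration: the population size, the 20% rate, and the variable names. It is not real survey data.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic population: True means "smokes". The 20% rate is an assumption.
population = [rng.random() < 0.20 for _ in range(100_000)]

# Asking everyone: exact for this population, but requires every answer.
full_rate = sum(population) / len(population)

# Asking 10,000 randomly chosen members: cheaper, but only an estimate.
sample = rng.sample(population, 10_000)
sample_rate = sum(sample) / len(sample)

print(f"everyone: {full_rate:.3f}, sample of 10,000: {sample_rate:.3f}")
```

The two rates will usually be close but not identical, which is the inaccuracy that sampling accepts in exchange for less collection effort.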

Sampling terminology

A population is the group of individuals (subjects) about whom we want to collect information.

A census is information about every individual in the population.

A sample is information about a part of the population (chosen to represent the whole).

Random sample

For a sample to be representative of the population, it must be collected randomly.

A random sample is a sample in which each member of the population has an equal chance of appearing in the sample.
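
The "equal chance" property can be checked empirically. The sketch below, which assumes a tiny hypothetical population of 10 members, draws many random samples with Python's `random.sample` and counts how often each member is included.

```python
import random
from collections import Counter

rng = random.Random(0)      # fixed seed so the run is repeatable
members = list(range(10))   # a tiny hypothetical population of 10 members
sample_size = 3
trials = 20_000

# Count how many of the drawn samples each member appears in.
counts = Counter()
for _ in range(trials):
    counts.update(rng.sample(members, sample_size))

# Each member should appear in roughly sample_size / len(members) = 30% of samples.
inclusion_rates = {m: counts[m] / trials for m in members}

for member, rate in inclusion_rates.items():
    print(f"member {member}: included in {rate:.1%} of samples")
```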

Sampling bias

Sampling bias (error) occurs when samples are collected in a way that makes some individuals less (or more) likely to be included in the sample than others.
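
A small simulation can show the effect. In this sketch the population of ages is synthetic, and the "only people under 40 can be reached" rule is an assumed example of a biased collection method.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Synthetic population of ages; the 18-90 range is an assumption.
ages = [rng.randint(18, 90) for _ in range(50_000)]
true_mean = sum(ages) / len(ages)

# Random sample: every member is equally likely to be picked.
random_sample = rng.sample(ages, 1_000)
random_mean = sum(random_sample) / len(random_sample)

# Biased sample: only people under 40 can be picked,
# so older members have no chance of being included.
under_40 = [age for age in ages if age < 40]
biased_sample = rng.sample(under_40, 1_000)
biased_mean = sum(biased_sample) / len(biased_sample)

print(f"true mean age:      {true_mean:.1f}")
print(f"random sample mean: {random_mean:.1f}")
print(f"biased sample mean: {biased_mean:.1f}")
```

The biased sample is the same size as the random one, yet its mean is far from the population mean. A larger sample does not repair a biased collection method.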

Big data

Big data is data that humans cannot process without the assistance of advanced machines.

Big data doesn't have a specific size definition, but datasets are constantly getting larger as we continuously collect more data and store it at increasingly lower costs.

Data mining

Big data comes with complex data structures.

A large part of big data processing involves refining the data.
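
What refinement looks like depends on the data. As one hedged illustration, the sketch below drops rows with missing or non-numeric values and removes exact duplicates. The `raw` records and the `refine` function are hypothetical and only show the general idea.

```python
# Hypothetical raw records, for illustration only.
raw = [
    {"area": "50", "price": "7"},
    {"area": "60", "price": ""},     # missing price
    {"area": "50", "price": "7"},    # duplicate of the first record
    {"area": "abc", "price": "9"},   # area is not a number
    {"area": "80", "price": "9"},
]


def refine(records):
    """Parse numeric fields, dropping invalid rows and exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        try:
            row = (float(rec["area"]), float(rec["price"]))
        except ValueError:
            continue  # skip rows with missing or non-numeric values
        if row in seen:
            continue  # skip rows already kept once
        seen.add(row)
        cleaned.append({"area": row[0], "price": row[1]})
    return cleaned


cleaned = refine(raw)
print(cleaned)
```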
