Introduction
Data collection means gathering information to answer questions, make decisions, or understand something better.
Methods of Data Collection
Surveys
Surveys collect information by asking people questions.
Example: Asking classmates about their favourite ice cream flavour.
Good Survey Practices
- Ask clear and simple questions
- Keep the survey short
- Use multiple-choice or rating questions
- Keep answers anonymous
- Test the survey first
- Analyze results carefully
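As a rough illustration of the last step, the sketch below tallies the answers from the ice cream survey. The responses are made-up sample data, and `tally` is a hypothetical helper name, not part of any survey tool.

```python
from collections import Counter

# Made-up answers from a class survey (illustration only).
responses = ["vanilla", "chocolate", "vanilla", "strawberry", "chocolate", "vanilla"]

def tally(answers):
    """Count how many times each answer was given."""
    return Counter(answers)

counts = tally(responses)

# Print the answers from most to least popular.
for flavour, n in counts.most_common():
    print(f"{flavour}: {n}")
```

Counting each answer is usually the first thing done with multiple-choice results, which is one reason such questions are easy to analyze.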
Questionnaires
Written forms with a set of questions.
Example: A school questionnaire asking students about activities.
Interviews
One-on-one discussions for detailed information.
Example: Interviewing a teacher about their experience.
Observations
Watching real situations to understand behavior.
Online Data Sources
Websites, databases, and digital tools.
Data Extraction
Data extraction means selecting and saving only the important information from a large amount of data.
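A minimal sketch of this idea, assuming the data is already available as a list of records: only the fields that matter for the question are kept. The records, field names, and the `extract` helper are all invented for illustration.

```python
# Made-up student records with more detail than we need.
records = [
    {"student_id": 1, "name": "Asha", "class": "7A", "hobby": "chess", "fees": 1200},
    {"student_id": 2, "name": "Bilal", "class": "7B", "hobby": "football", "fees": 1150},
]

def extract(rows, fields):
    """Keep only the listed fields from each record."""
    return [{field: row[field] for field in fields} for row in rows]

# Suppose only the ID and name are important for our task.
extracted = extract(records, ["student_id", "name"])
print(extracted)
```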
Types of Data (Based on Storage)
Structured Data
Structured data is well-organized and easy to search. It is stored in rows and columns, like spreadsheets or databases.
Example: A table with student ID, name, class, date of birth, and fees.
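To show why rows and columns make data easy to search, here is a small sketch that reads such a table and finds every student in one class. The table contents are invented; the sketch uses Python's standard `csv` module.

```python
import csv
import io

# A made-up table in CSV form: the first line names the columns.
CSV_TEXT = """student_id,name,class,date_of_birth,fees
1,Asha,7A,2012-03-14,1200
2,Bilal,7B,2012-07-02,1150
3,Chen,7A,2011-11-30,1200
"""

# DictReader turns each row into a dictionary keyed by column name.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Because every row has the same columns, searching is a one-line filter.
class_7a = [row["name"] for row in rows if row["class"] == "7A"]
print(class_7a)
```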
Unstructured Data
Unstructured data has no fixed format. This data is harder to organize but still very useful.
Example: Emails, social media posts, images, videos, and text messages.
Data Visualization
Data visualization means showing data using pictures like charts and graphs. It helps us understand data easily by showing patterns and trends.
Importance
- Makes data easy to understand
- Saves time
- Helps quickly see trends and comparisons
- Easier than reading long lists of numbers
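As a simple demonstration, the sketch below turns a few counts into a text bar chart. The numbers are made up and `bar_chart` is a hypothetical helper; real charts are normally drawn with a spreadsheet or a plotting tool.

```python
def bar_chart(counts, symbol="#"):
    """Build a text bar chart with one line per label."""
    width = max(len(label) for label in counts)
    lines = []
    for label, value in counts.items():
        # Pad each label so the bars start in the same column.
        lines.append(f"{label.ljust(width)} | {symbol * value} {value}")
    return "\n".join(lines)

# Made-up survey counts.
flavour_counts = {"vanilla": 5, "chocolate": 3, "mint": 1}
chart = bar_chart(flavour_counts)
print(chart)
```

Even in this plain form, the longest bar shows the most popular answer at a glance, which is quicker than comparing the numbers one by one.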
Data Pre-Processing & Analysis
Data Pre-Processing
Cleaning and organizing data before analysis.
Data Pre-Processing Techniques
1. Checking Data Quality
Ensure data is correct, complete, and up-to-date.
Example: Ensuring every enrolled student has an accurate, up-to-date test score recorded.
2. Common Problems
- Errors: Wrong data values, e.g. a score recorded as 105 when the maximum is 100.
- Outliers: Values that are unusually high or low compared with the rest of the data, e.g. one student scoring 5 when most students scored between 50 and 80.
- Bias: Distortions that affect the accuracy of data, e.g. using survey results from one school to represent all schools in a city.
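The first two problems can often be spotted automatically. The sketch below is one possible approach: it treats any score outside 0 to 100 as an error, and it flags outliers with the common "1.5 times the interquartile range" rule of thumb. That rule, the sample scores, and the helper names are assumptions chosen for illustration. Bias usually cannot be found this way, because it depends on how the data was collected.

```python
from statistics import quantiles

MAX_SCORE = 100  # assumed maximum mark

# Made-up test scores containing one error and one outlier.
scores = [55, 62, 70, 75, 80, 58, 66, 5, 105]

def find_errors(values, low=0, high=MAX_SCORE):
    """Return values that are impossible because they fall outside the allowed range."""
    return [v for v in values if not low <= v <= high]

def find_outliers(values):
    """Return values far from the middle half of the data (1.5 x IQR rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    lower, upper = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < lower or v > upper]

errors = find_errors(scores)
valid = [v for v in scores if v not in errors]
outliers = find_outliers(valid)

print("Errors:", errors)      # Errors: [105]
print("Outliers:", outliers)  # Outliers: [5]
```

An outlier is not always wrong: a student really may have scored 5, so flagged values should be checked before they are changed or removed.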
Validation & Cleaning
Validation: Check that data is complete and correct.
Cleaning: Fix or remove wrong data, fill missing values, and handle outliers.
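A minimal sketch of both steps, assuming each record holds a name and a score out of 100. The records and the `validate` and `clean` helpers are hypothetical, and filling a missing score with the median is only one of several possible choices.

```python
from statistics import median

MAX_SCORE = 100  # assumed maximum mark

# Made-up records: one score is missing and one is impossible.
records = [
    {"name": "Asha", "score": 72},
    {"name": "Bilal", "score": None},
    {"name": "Chen", "score": 105},
    {"name": "Dina", "score": 64},
]

def validate(record):
    """Return a list of problems in one record (an empty list means it is valid)."""
    problems = []
    score = record.get("score")
    if score is None:
        problems.append("missing score")
    elif not 0 <= score <= MAX_SCORE:
        problems.append("score out of range")
    return problems

def clean(rows):
    """Remove out-of-range records and fill missing scores with the median."""
    fill = median(r["score"] for r in rows if not validate(r))
    cleaned = []
    for r in rows:
        problems = validate(r)
        if "score out of range" in problems:
            continue  # remove wrong data
        if "missing score" in problems:
            r = {**r, "score": fill}  # fill the missing value
        cleaned.append(r)
    return cleaned

cleaned = clean(records)
print(cleaned)
```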
Data Analysis Techniques
- Quantitative: Uses numbers and measurements to find patterns and trends.
- Qualitative: Uses non-numerical data like text, images, and sounds to understand meanings and experiences.
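The difference between the two techniques can be sketched in a few lines. The scores, comments, and keyword-to-theme mapping below are invented, and matching keywords is only a very rough stand-in for real qualitative analysis, which relies on a person reading and interpreting the responses.

```python
from collections import Counter
from statistics import mean

# Quantitative: numbers are summarised with calculations.
scores = [55, 62, 70, 75, 80]
average = mean(scores)
print("Average score:", average)

# Qualitative: text is grouped by meaning (here, by simple keywords).
comments = [
    "The lab was fun",
    "Too much homework",
    "Fun projects, helpful teacher",
    "Homework takes long",
]
themes = {"fun": "enjoyment", "homework": "workload", "helpful": "support"}

def count_themes(texts, keyword_themes):
    """Count how many comments mention each theme."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for keyword, theme in keyword_themes.items():
            if keyword in lowered:
                counts[theme] += 1
    return counts

theme_counts = count_themes(comments, themes)
print(theme_counts)
```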
Cloud Storage
Cloud storage keeps data on remote servers that are reached over the internet. It is used for saving files, sharing data, and accessing information from any device.
Benefits
- Access from anywhere
- Safe backups
- Real-time sharing
Remote Access
Remote access allows you to use a computer or network from another location. You can open files or software even when you are not physically present.
Example: Google Drive and OneDrive let you open your files from another device.
Data Backups
Data backups are copies of important files stored separately to prevent data loss. They protect data from deletion, system failure, or viruses.
Example: Saving a school project on Google Drive or a USB.
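As a sketch of what making a backup involves, the code below copies a file into a separate folder and adds the date and time to the copy's name so older backups are not overwritten. The `backup_file` helper, the file names, and the naming scheme are assumptions for illustration. The demonstration runs inside a temporary folder so it does not touch real files.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def backup_file(source, backup_dir):
    """Copy a file into backup_dir with a timestamp in its name and return the new path."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}-{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copies the contents and the file's timestamps
    return target

# Demonstration in a temporary folder that is deleted afterwards.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "project.txt"
    project.write_text("My school project")
    copy = backup_file(project, Path(tmp) / "backups")
    print("Backup saved as:", copy.name)
```

In practice the backup folder would be on a different device or service, such as a USB drive or a cloud-synced folder, so that one failure cannot destroy both copies.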
Automatic Backups
Devices can be set to automatically back up data to cloud services like OneDrive.
MCQs
1. What is data collection?
- A) Data deletion
- B) Gathering information
- C) Data storage
- D) Data visualization
Answer: B) Gathering information
2. Which method involves one-on-one discussion?
- A) Survey
- B) Observation
- C) Interview
- D) Questionnaire
Answer: C) Interview
3. Data stored in rows and columns is called:
- A) Unstructured Data
- B) Raw Data
- C) Structured Data
- D) Visual Data
Answer: C) Structured Data
4. Which of the following is an example of unstructured data?
- A) Spreadsheet
- B) Database table
- C) Image
- D) CSV file
Answer: C) Image
5. Data visualization helps to:
- A) Hide data
- B) Make data harder
- C) Understand trends easily
- D) Delete data
Answer: C) Understand trends easily