Introduction to the World of Data Science

Data science is the most revolutionary technology of the era. It's all about deriving useful insights from data in order to solve complex real-world problems. The first module is an introduction to data science that covers all the basic fundamentals. The next module is the supervised learning algorithms module, where we'll start by understanding the most basic algorithm, which is linear regression.

We'll discuss how Walmart is using insightful patterns from its database to increase the potential of its business. After that, we'll see what exactly data science is, then we'll move on and discuss who a data scientist is, along with the various skill sets. After this, we'll discuss how data is extracted, processed, and finally used as a solution. We'll discuss a use case of k-means clustering, after which we can move on to see the various data science job roles, such as data analyst, data architect, and data engineer.

We produce 2.5 quintillion bytes of data each day, and this is only accelerating with the growth of the IoT, or the Internet of Things. IoT data is measured in zettabytes, and one zettabyte is equal to a trillion gigabytes. According to a recent survey by Cisco, it's estimated that by the end of 2019 the IoT will generate more than 500 zettabytes of data per year. This number will only increase over time. Social media is generating a lot of data for us, too. We pay bills online. We even buy homes online these days, and you can even sell your pets on OLX. Data science is a process that extracts useful information from all this data.

Walmart is the world's biggest retailer, with over 20,000 stores in just 28 countries. Walmart is currently building the world's biggest private cloud, which will be able to process 2.5 petabytes of data every hour. The reason behind Walmart's success is how they use customer data to get useful insights about customers' shopping patterns. Walmart found out that
strawberry Pop-Tart sales increased by seven times before a hurricane. Walmart uses data in a very effective manner: they process and analyze it very well, and they find the useful insights they need in order to get more customers or improve their business. Netflix analyzes the movie viewing patterns of users to understand what drives user interest and to see what users want to watch.

There are actually many machine learning algorithms that are based on linear algebra. So, guys, overall you need to have a good understanding of math. Apart from that, data scientists utilize technology, so they have to be really good with technology. It is also important for a data scientist to be a tactical business consultant. Programming languages are a must: at the minimum, you should know R, Python, and a database query language.

Data extraction and processing are all about getting data from different data sources and then putting it in a format that you can analyze. Next is data wrangling and exploration. Data visualization is one of the most important parts of data analysis.

Data is being generated at an unstoppable pace, and there are many job roles in data science. Data scientists have to understand the challenges of the business and offer the best solution using data analysis and data processing. A data analyst is responsible for a variety of tasks, including visualization and processing of massive amounts of data. They also have to perform queries on databases. A data architect creates the blueprints for data management so that the databases can be easily integrated, centralized, and protected. They also ensure that the data engineers have the best tools and systems. A business analyst acts as a link between the data engineers and the management executives. A data and analytics manager is responsible for data science operations.

Data mining is the process of gathering data from
different sources. At this stage, some of the questions you can ask yourself are: What data do I need for my project? Where does it live? How can I obtain it? And what is the most efficient way to store and access all of it? Then there is data exploration, modeling, and finally deployment. The data science process involves a lot of time and effort to find the right data. Data cleaning is the most time-consuming task in the data science process. The data exploration stage is basically the brainstorming of data analysis. This is where you try to explore the different models that can be applied to your data. Next up

