Solving Big Data Industry Use Cases with AWS Cloud Computing
1. The Big Data Story (Cloud IT Better)
• Big Data is getting bigger and bigger!
• Why is cloud Big Data’s best friend?
• Identifying Big Data industry use cases
• Figuring out the Big Data life cycle
• How AWS building blocks can help tame Big Data!
• Cloudlytics – a Big Data use case
2. So What is Big Data? Simply put, Big Data is data that cannot be processed with current tools or technologies. Big Data is too big, too fast, too varied & too unpredictable!
* Twitter & Flickr visualizations in North America
3. The 4Vs - Volume
4. The 4Vs - Volume
• 2.5 quintillion bytes (roughly 2.5 billion GB) of data are generated every day!
• 40 zettabytes of data will be created by 2020!
• Most companies in the U.S. have 100 TB of data stored.
5. The 4Vs - Variety
6. The 4Vs - Velocity
7. The 4Vs - Velocity
• The NY Stock Exchange captures 1 TB of trade information every day.
• Modern cars have close to 100 sensors, measuring everything from fuel level to tire pressure.
8. The 4Vs - Veracity
9. The 4Vs - Veracity
• 27% of respondents in a survey were unsure how much of their data was inaccurate.
• Poor data quality costs the U.S. $3.1 trillion per year!
10. Big Data is Getting Bigger and BIGGER!
“More data crosses the internet EVERY SECOND than were stored in the entire internet just 20 years ago!”
“Zuckerberg noted that 1 billion pieces of content are shared via Facebook’s Open Graph DAILY!”
“It is estimated that Walmart collects more than 2.5 petabytes of data EVERY HOUR from its customer transactions.”
11. What Is the Impact of Big Data in Your Industry? This McKinsey report shows how Big Data will impact different industries. Whatever your industry, you can’t survive without Big Data beyond the next 10-15 years. So be the early bird!
12. Ring Ring? Anyone There? The Telecom Industry
Your networks are spread over 1,000 cities, some with over a million connections. Challenges include:
• Customer churn & retention
• Understanding payment details for new schemes
• Providing customized payment & service models to the right customers
• Understanding competitor pricing models & innovating faster
• Checking which offers & schemes are popular in which geographies
13. Telecom - Opportunities Using Big Data
Innovative Business Models:
• Create data-driven API models for improved customer service.
• Use payment data from retail chains & outlets to create coupons & offers, combining them with NFC (Near Field Communication) to increase customers’ buying frequency.
Operational Efficiency (10-15% OPEX reduction):
• Create world-class customer care by tracking in-depth subscriber activity, tracking issues, and reducing call center iterations & time.
• Anticipate & implement network planning ahead of demand, predicting network stress points & under-utilized network areas.
Precise Business Models:
• Optimize offers based on subscriber network usage patterns & traffic to come up with newer offers, which is critical for driving adoption of value-added services.
• Help service providers understand which behaviors trigger churn events & which actions prevent churn, using dynamic offers created by complaint triggers in real time. Reducing churn rates by 8-12%.
Real-Time Analysis & Decision Making (revenue potential increase of 5-10%):
• Control RAN (Radio Access Network) congestion by dividing subscribers into individual sub-cell levels; combining data on past geographic positions with real-time data on current locations makes it possible to give certain subscribers priority over others.
• Cyber Cop initiatives, where pattern matching on subscriber activity is used to detect malicious activity and to spot abnormal changes in traffic or consumption that predict fraud.
14. What are You Buying Today? The Retail Industry
Imagine your retail outlets spread over 100 cities, with 10 or more outlets per city. Even if each outlet has 10,000 customers per month buying at a frequency of 5, you end up with 50,000,000 unique records! This does not even take into consideration age, gender, geographic trend, weather, time of the month & more!
Challenges: The Number Jumble
• Buying patterns
• Shopping offers
• Cross-selling success
• Loyalty & retention
• Effective marketing campaigns
• Predictive demand
• Dynamic price optimization
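The record count quoted on this slide can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes exactly 100 cities and 10 outlets per city (the slide's lower bounds) and reads the 10,000 customers per month as a per-outlet figure. The variable names are illustrative.

```python
# Back-of-the-envelope record count for the retail example.
# Assumptions: lower-bound figures from the slide, customers counted per outlet.
cities = 100
outlets_per_city = 10
customers_per_outlet_per_month = 10_000
purchases_per_customer = 5

monthly_records = (
    cities
    * outlets_per_city
    * customers_per_outlet_per_month
    * purchases_per_customer
)
print(f"{monthly_records:,}")  # 50,000,000
```

Each extra attribute tracked per purchase (age, gender, weather and so on) adds columns to every one of these records rather than new rows.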
15. Retail - Opportunities Using Big Data
Personalized Recommendations:
• Data is collected from previous online & offline purchases; even online clicks, likes & wish lists are recorded to generate recommendations in real time.
• Online shoppers are given recommendations at reduced prices based on their previous purchase trends. Amazon.com has increased its sales volumes by 25% this way.
Dynamic Pricing:
• Online shoppers are given reduced prices based on time of day, validity period of festive offers, customer loyalty and more.
• Offline retailers use these data-driven approaches to map shopping patterns and, based on the proximity of customers, vary prices with RFID price tags.
In-Store Experience:
• Geo-fencing allows retailers to send real-time offers to customers’ cell phones (based on their previous shopping sprees) as they enter a geo-fenced area.
• Optimized product placement is done scientifically: algorithms check buyer tendencies & have products placed in the right geography. This becomes very important for bigger players like Wal-Mart & Macy's.
Micro-segmentation & Inventory Management:
• Customer segmentation has been taken to the next level with social media interactions, marketing campaign results and wish lists. Targeted offers are now made to granular customer segments with promo codes & coupons!
• With Big Data, retailers can get predictive analytics on prices as they fluctuate through the supply chain. This allows them to set prices and react proactively to demand spikes, avoiding both overstocking and stock-outs.
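The geo-fencing idea on this slide comes down to a distance check between a customer's phone and a store. The sketch below is one possible way to do that check, using the standard haversine great-circle formula; the coordinates, the 200 m radius and the function names are all hypothetical and not taken from the deck.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside_geofence(customer, store, radius_m):
    """True when the customer is within radius_m metres of the store."""
    return haversine_m(*customer, *store) <= radius_m


# Hypothetical coordinates for illustration only.
store = (40.7506, -73.9935)
nearby_customer = (40.7510, -73.9940)
distant_customer = (40.7800, -73.9600)

print(inside_geofence(nearby_customer, store, radius_m=200))
print(inside_geofence(distant_customer, store, radius_m=200))
```

In a real deployment the check would run on a stream of location updates, and a positive result would trigger the offer lookup for that customer.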
16. Will My Policy Cover this Accident? Challenges - Insurance Industry
Image courtesy: blairingle
• $80 billion is lost in the U.S. per year due to fraud!
• 15% of premium costs in South Africa are due to fraud!
• 25-33% of automobile claims are fraudulent!
• Are your claims fraudulent?
• What policies are best suited for customers?
• How to tackle increasing diseases & ailments?
• How to reduce customer hassles?
17. Insurance - Opportunities Using Big Data
• Fraud Detection: Turn the claim-centric approach into a person-centric one. Use cohort analytics to track the social activities of the beneficiary & associated parties, and integrate these data streams to detect fraud patterns in advance (predictive analysis).
• Delighting the Customer: A variety of customer records can be stored in NoSQL databases and integrated in real time with multiple sources to optimize insurance validation & reimbursement. Customer call logs & interactions with staff can be checked with sentiment analysis to optimize insurance processes & reduce iterations based on negative customer calls.
• Predictive Analysis: Understand customer lifestyles by integrating social network feeds to determine disease patterns, so that insurance companies can identify degenerating lifestyles 10-15 years into the future & offer schemes at higher premiums.
• Improving Product Opportunities: Check the success of insurance schemes & which are most popular, to drive similar scheme models & understand why other schemes are not popular. This can be done by mapping the lifestyle trends of the successful customer base.
18. I Want My Car to Drive on its OWN! Connected cars are no longer a concept!
• Internet-capable vehicles in Europe: 48 million by 2016!
• There are more than 74 sensors in Ford’s connected cars!
• These hybrid cars can generate 25 GB of data per hour!
19. Automotive - Opportunities Using Big Data
• Vehicle Insurance: Using "telematics", a driver's driving patterns can be analyzed. Insurance companies can use these to send alerts & warnings in real time & even offer personalized pricing.
• Integrating with Geo-Fencing & Social Media: GPS trackers can provide customized alerts as vehicles pass a particular location. These alerts are real time & based on your social media likes & shares, combined with discount coupons & offers!
• Self Repair & Maintenance: Your car's intelligence system will keep track of all parts & fluids to be changed or repaired for periodic maintenance, giving you real-time alerts as you pass repair shops, which will also bid for discounted pricing!
• Learning from the Mistakes: Using a black box mechanism similar to that in airplanes, product engineers can understand whether any vehicle part caused an accident & how part designs can be improved to reduce future accidents.
20. Don’t Watch that Movie! It’s Pathetic! Challenges – The Media Industry
Did you know? YouTube users upload 48 hours of video every minute!
• What content is popular?
• Where are my viewers coming from?
• What are my viewers’ opinions about my content?
• How do I monetize my content?
21. Media - Opportunities Using Big Data
• Predictive Analysis: Use Big Data tools to analyze the content currently viewed, its storylines, characters etc. to determine which types of movies or soaps are going to succeed in the future.
• Log Analysis: Analyze viewer demographics, popular content, the devices, browsers and operating systems used to view and download data, and detect spam, to generate actionable reports that drive business decisions.
• Sentiment Analysis: Track user comments, likes, shares, tweets & other user interactions with media content on social networks to gauge popularity & promote content similar to the hits.
• Website Optimization: Based on analysis of the navigation patterns that are popular among users, website builders can make content easier to reach.
• Ad Targeting & Scope to Monetize: Ad servers use visitor cookie analysis & bid values, sometimes generated in real time, to deliver ads to visitors in real time & continuously update based on successful clicks & failures.
22. When will my Loan Get Sanctioned? Challenges – The Banking Industry
• Detecting fraud
• Which schemes for which customers?
• When is my customer not happy?
• How do I segment my customer base?
23. Banking - Opportunities Using Big Data
• Creating Customer Segmentation: Banks are now pooling all types of customer buying patterns, lifestyle habits & interests to create segments. With this 360-degree view from Big Data analytics integration, banks can customize product offerings & redistribute spending from non-profitable to profitable customers.
• Fraud Detection: Financial & banking institutions are using credit/debit card purchases to understand spending habits and detect suspicious buying patterns that indicate fraud.
• Customer Sentiment Analysis: Banks and financial institutions can now track the success or failure of their products or schemes by integrating social sentiment about their products and tracking user complaints.
• Sales and Marketing Campaigns: Using 360-degree customer insights, banks are generating smarter marketing & sales campaigns, integrating them with the offers & schemes that are most successful with each customer segment.
• Analyzing Voice Sentiments: Many banks are now using highly unstructured data, such as customer voice recordings, with complex data analysis to track customer complaints. They are also trying to integrate this information with the transactional data warehouse to reduce attrition, drive up sales & even detect fraud.
24. Let us Figure out the Big Data Life Cycle. To make the entire Big Data process more tangible, it is divided into 4 stages: (1) Generation, (2) Collection & Store, (3) Analyze & Computation, (4) Data Collaboration & Sharing.
25. Generation: Generating the Data
• Structured data – employee records
• Semi-structured data – end-user logs
• Unstructured data – social user profile images
Typical workloads: data mining, log file analysis, machine learning, web indexing, financial analysis, scientific simulations, data warehousing, bioinformatics research.
Web-based APIs can be used to access this data and store it.
26. Fitting AWS Cloud Components
• AWS Direct Connect: Establish a dedicated network connection from your premises to AWS.
• AWS Storage Gateway: Secure integration between on-premises IT & AWS’s storage infrastructure.
• AWS Import/Export: Move large amounts of data into and out of AWS using portable storage devices for transport.
27. Transferring Your Data to AWS Cloud
• AWS Direct Connect: Establish a dedicated network connection from your premises to AWS.
• AWS Storage Gateway: Secure integration between on-premises IT & AWS’s storage infrastructure.
• AWS Import/Export: Move large amounts of data into and out of AWS using portable storage devices for transport.
28. Collecting & Storing Data on AWS Cloud
• Simple Storage Service (S3): Write, read, and delete objects containing from 1 byte to 5 terabytes of data each.
• Relational Database Service (RDS): A full-featured relational database service giving you access to the capabilities of the MySQL, Oracle, SQL Server, or PostgreSQL database engines.
• AWS DynamoDB: A fast, fully managed NoSQL database service making it simple & cost-effective to store & retrieve any amount of data, and serve any level of request traffic.
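The write, read and delete cycle described for S3 could look roughly like the sketch below. The three calls (`put_object`, `get_object`, `delete_object`) and their `Bucket`, `Key` and `Body` parameters follow the boto3 S3 client; everything else is an assumption made for illustration. In particular, `InMemoryS3` is a hypothetical stand-in so the example runs without an AWS account, and the bucket and key names are made up.

```python
import io

MAX_OBJECT_BYTES = 5 * 10**12  # 5 TB upper bound quoted on the slide


class InMemoryS3:
    """Hypothetical stand-in that mimics three boto3 S3 client calls."""

    def __init__(self):
        self._objects = {}

    def put_object(self, Bucket, Key, Body):
        if len(Body) > MAX_OBJECT_BYTES:
            raise ValueError("object exceeds the 5 TB limit")
        self._objects[(Bucket, Key)] = bytes(Body)
        return {}

    def get_object(self, Bucket, Key):
        # boto3 returns a streaming body; BytesIO offers the same .read() call.
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}

    def delete_object(self, Bucket, Key):
        self._objects.pop((Bucket, Key), None)
        return {}


def write_read_delete(s3, bucket, key, data):
    """Store an object, read it back, then remove it."""
    s3.put_object(Bucket=bucket, Key=key, Body=data)
    stored = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    s3.delete_object(Bucket=bucket, Key=key)
    return stored


client = InMemoryS3()
result = write_read_delete(client, "example-bucket", "logs/day1.txt", b"hello")
print(result)
```

Against the real service, the same `write_read_delete` function should work with `boto3.client("s3")` passed in place of `InMemoryS3()`, given valid credentials and an existing bucket.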
29. Data Analysis, Retrieval & Automation
• Analysis - Amazon Elastic MapReduce (EMR): A managed Hadoop distribution by Amazon Web Services using a customized Apache Hadoop framework. It integrates with AWS S3 & EC2.
• Retrieval - AWS Redshift: Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service making it simple & cost-effective to efficiently analyze all your data using your existing business intelligence tools.
• Automation - AWS Data Pipeline: Lets users define a dependent chain of data sources and destinations, with optional data processing activities. This chain is called a pipeline.
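EMR runs Hadoop jobs, which follow the MapReduce pattern: a map step emits key/value pairs, the framework groups them by key, and a reduce step aggregates each group. The sketch below illustrates that pattern in a single Python process. It is not the EMR or Hadoop API, and the log format (HTTP status code as the last field) is an assumption chosen for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit (status_code, 1) for one log line; assumes the status is the last field."""
    fields = line.split()
    if fields:
        yield fields[-1], 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


# Hypothetical log lines for illustration only.
log_lines = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
]

mapped = [pair for line in log_lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)  # {'200': 2, '404': 1}
```

On a cluster the map and reduce functions run in parallel across many nodes, with the input typically read from S3. The logic per record stays the same.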
30. AWS Kinesis (Big Data in Real Time)
Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. Amazon Kinesis can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources.
• Real-time processing lets you answer questions about the current state of your data.
• Amazon Kinesis automatically provisions & manages the storage required to reliably & durably collect your data stream.
• Your Kinesis streams are connected to your Kinesis app, from which you can use DynamoDB or Redshift to process complex queries in real time.
Image courtesy: https://static.gosquared.com/images/liquidicity/kinesis/
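Answering questions about the current state of a stream usually means keeping a small rolling summary as records arrive. The sketch below shows one such summary, a per-source event count over a sliding time window. It is an illustration of the idea only, not the Kinesis API: the class name, the 60-second window and the sample records are hypothetical, and it assumes records arrive in timestamp order.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Counts events per key over the last `window_seconds` of stream time."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, key) pairs, oldest first
        self._counts = Counter()

    def add(self, timestamp, key):
        """Record one event, then drop events that fell out of the window."""
        self._events.append((timestamp, key))
        self._counts[key] += 1
        self._evict(timestamp)

    def _evict(self, now):
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_key = self._events.popleft()
            self._counts[old_key] -= 1
            if self._counts[old_key] == 0:
                del self._counts[old_key]

    def current(self):
        """Snapshot of per-key counts inside the window."""
        return dict(self._counts)


# Hypothetical stream of (timestamp_seconds, source_id) records.
window = SlidingWindowCounter(window_seconds=60)
for ts, sensor in [(0, "car-1"), (10, "car-2"), (30, "car-1"), (65, "car-2")]:
    window.add(ts, sensor)
print(window.current())  # {'car-1': 1, 'car-2': 2}
```

The event at time 0 is dropped once the record at time 65 arrives, so the snapshot always reflects only the most recent minute of data.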
31. The Big Data Life Cycle - Compiled
Stages: Generation; Collection & Store; Analyze & Computation; Data Collaboration & Sharing.
AWS services shown: S3, RDS, DynamoDB, EMR, Redshift, Data Pipeline.
32. Use Case - Cloudlytics
Cloudlytics is a pay-as-you-go, SaaS-based log analytics tool powered by AWS. It takes the Big Data approach using AWS components such as EMR & Redshift.
Flow: customer log files stored in S3 → processing → processed data → customer reports.