Official Google Cloud Certified Professional Data Engineer Study Guide

Available immediately | Delivery time: available immediately

71,16 €*

All prices include VAT. | Free shipping
ISBN-13:
9781119618430
Published:
2020
Publication date:
07.06.2020
Pages:
352
Author:
Dan Sullivan
Weight:
656 g
Format:
234x189x22 mm
Language:
English
Description:

The proven Study Guide that prepares you for this new Google Cloud exam

The Google Cloud Certified Professional Data Engineer Study Guide provides everything you need to prepare for this important exam and master the skills necessary to land that coveted Google Cloud Professional Data Engineer certification. Beginning with a pre-book assessment quiz to evaluate what you know before you begin, each chapter features exam objectives and review questions, and the online learning environment includes additional complete practice tests.

Written by Dan Sullivan, a popular and experienced online course author for machine learning, big data, and cloud topics, Google Cloud Certified Professional Data Engineer Study Guide is your ace in the hole for deploying and managing analytics and machine learning applications.

* Build and operationalize storage systems, pipelines, and compute infrastructure
* Understand machine learning models and learn how to select pre-built models
* Monitor and troubleshoot machine learning models
* Design analytics and machine learning applications that are secure, scalable, and highly available

This exam guide is designed to help you develop an in-depth understanding of data engineering and machine learning on Google Cloud Platform.
Table of contents:

Introduction
Assessment Test

Chapter 1 Selecting Appropriate Storage Technologies
  From Business Requirements to Storage Systems
    Ingest
    Store
    Process and Analyze
    Explore and Visualize
  Technical Aspects of Data: Volume, Velocity, Variation, Access, and Security
    Volume
    Velocity
    Variation in Structure
    Data Access Patterns
    Security Requirements
  Types of Structure: Structured, Semi-Structured, and Unstructured
    Structured: Transactional vs. Analytical
    Semi-Structured: Fully Indexed vs. Row Key Access
    Unstructured Data
    Google's Storage Decision Tree
  Schema Design Considerations
    Relational Database Design
    NoSQL Database Design
  Exam Essentials
  Review Questions

Chapter 2 Building and Operationalizing Storage Systems
  Cloud SQL
    Configuring Cloud SQL
    Improving Read Performance with Read Replicas
    Importing and Exporting Data
  Cloud Spanner
    Configuring Cloud Spanner
    Replication in Cloud Spanner
    Database Design Considerations
    Importing and Exporting Data
  Cloud Bigtable
    Configuring Bigtable
    Database Design Considerations
    Importing and Exporting
  Cloud Firestore
    Cloud Firestore Data Model
    Indexing and Querying
    Importing and Exporting
  BigQuery
    BigQuery Datasets
    Loading and Exporting Data
    Clustering, Partitioning, and Sharding Tables
    Streaming Inserts
    Monitoring and Logging in BigQuery
    BigQuery Cost Considerations
    Tips for Optimizing BigQuery
  Cloud Memorystore
  Cloud Storage
    Organizing Objects in a Namespace
    Storage Tiers
    Cloud Storage Use Cases
    Data Retention and Lifecycle Management
  Unmanaged Databases
  Exam Essentials
  Review Questions

Chapter 3 Designing Data Pipelines
  Overview of Data Pipelines
    Data Pipeline Stages
    Types of Data Pipelines
  GCP Pipeline Components
    Cloud Pub/Sub
    Cloud Dataflow
    Cloud Dataproc
    Cloud Composer
  Migrating Hadoop and Spark to GCP
  Exam Essentials
  Review Questions

Chapter 4 Designing a Data Processing Solution
  Designing Infrastructure
    Choosing Infrastructure
    Availability, Reliability, and Scalability of Infrastructure
    Hybrid Cloud and Edge Computing
  Designing for Distributed Processing
    Distributed Processing: Messaging
    Distributed Processing: Services
  Migrating a Data Warehouse
    Assessing the Current State of a Data Warehouse
    Designing the Future State of a Data Warehouse
    Migrating Data, Jobs, and Access Controls
    Validating the Data Warehouse
  Exam Essentials
  Review Questions

Chapter 5 Building and Operationalizing Processing Infrastructure
  Provisioning and Adjusting Processing Resources
    Provisioning and Adjusting Compute Engine
    Provisioning and Adjusting Kubernetes Engine
    Provisioning and Adjusting Cloud Bigtable
    Provisioning and Adjusting Cloud Dataproc
    Configuring Managed Serverless Processing Services
  Monitoring Processing Resources
    Stackdriver Monitoring
    Stackdriver Logging
    Stackdriver Trace
  Exam Essentials
  Review Questions

Chapter 6 Designing for Security and Compliance
  Identity and Access Management with Cloud IAM
    Predefined Roles
    Custom Roles
    Using Roles with Service Accounts
    Access Control with Policies
  Using IAM with Storage and Processing Services
    Cloud Storage and IAM
    Cloud Bigtable and IAM
    BigQuery and IAM
    Cloud Dataflow and IAM
  Data Security
    Encryption
    Key Management
  Ensuring Privacy with the Data Loss Prevention API
    Detecting Sensitive Data
    Running Data Loss Prevention Jobs
    Inspection Best Practices
  Legal Compliance
    Health Insurance Portability and Accountability Act (HIPAA)
    Children's Online Privacy Protection Act
    FedRAMP
    General Data Protection Regulation
  Exam Essentials
  Review Questions

Chapter 7 Designing Databases for Reliability, Scalability, and Availability
  Designing Cloud Bigtable Databases for Scalability and Reliability
    Data Modeling with Cloud Bigtable
    Designing Row-keys
    Designing for Time Series
    Use Replication for Availability and Scalability
  Designing Cloud Spanner Databases for Scalability and Reliability
    Relational Database Features
    Interleaved Tables
    Primary Keys and Hotspots
    Database Splits
    Secondary Indexes
    Query Best Practices
  Designing BigQuery Databases for Data Warehousing
    Schema Design for Data Warehousing
    Clustered and Partitioned Tables
    Querying Data in BigQuery
    External Data Access
    BigQuery ML
  Exam Essentials
  Review Questions

Chapter 8 Understanding Data Operations for Flexibility and Portability
  Cataloging and Discovery with Data Catalog
    Searching in Data Catalog
    Tagging in Data Catalog
  Data Preprocessing with Dataprep
    Cleansing Data
    Discovering Data
    Enriching Data
    Importing and Exporting Data
    Structuring and Validating Data
  Visualizing with Data Studio
    Connecting to Data Sources
    Visualizing Data
    Sharing Data
  Exploring Data with Cloud Datalab
    Jupyter Notebooks
    Managing Cloud Datalab Instances
    Adding Libraries to Cloud Datalab Instances
  Orchestrating Workflows with Cloud Composer
    Airflow Environments
    Creating DAGs
    Airflow Logs
  Exam Essentials
  Review Questions

Chapter 9 Deploying Machine Learning Pipelines
  Structure of ML Pipelines
    Data Ingestion
    Data Preparation
    Data Segregation
    Model Training
    Model Evaluation
    Model Deployment
    Model Monitoring
  GCP Options for Deploying Machine Learning Pipeline
    Cloud AutoML
    BigQuery ML
    Kubeflow
    Spark Machine Learning
  Exam Essentials
  Review Questions

Chapter 10 Choosing Training and Serving Infrastructure
  Hardware Accelerators
    Graphics Processing Units
    Tensor Processing Units
    Choosing Between CPUs, GPUs, and TPUs
  Distributed and Single Machine Infrastructure
    Single Machine Model Training
    Distributed Model Training
    Serving Models
  Edge Computing with GCP
    Edge Computing Overview
    Edge Computing Components and Processes
    Edge TPU
    Cloud IoT
  Exam Essentials
  Review Questions

Chapter 11 Measuring, Monitoring, and Troubleshooting Machine Learning Models
  Three Types of Machine Learning Algorithms
    Supervised Learning
    Unsupervised Learning
    Anomaly Detection
    Reinforcement Learning
    Deep Learning
  Engineering Machine Learning Models
    Model Training and Evaluation
    Operationalizing ML Models
  Common Sources of Error in Machine Learning Models
    Data Quality
    Unbalanced Training Sets
    Types of Bias
  Exam Essentials
  Review Questions

Chapter 12 Leveraging Prebuilt Models as a Service
  Sight
    Vision AI
    Video AI
  Conversation
    Dialogflow
    Cloud Text-to-Speech API
    Cloud Speech-to-Text API
  Language
    Translation
    Natural Language
  Structured Data
    Recommendations AI API
    Cloud Inference API
  Exam Essentials
  Review Questions

Appendix Answers to Review Questions
  Chapter 1: Selecting Appropriate Storage Technologies
  Chapter 2: Building and Operationalizing Storage Systems
  Chapter 3: Designing Data Pipelines
  Chapter 4: Designing a Data Processing Solution
  Chapter 5: Building and Operationalizing Processing Infrastructure
  Chapter 6: Designing for Security and Compliance
  Chapter 7: Designing Databases for Reliability, Scalability, and Availability
  Chapter 8: Understanding Data Operations for Flexibility and Portability
  Chapter 9: Deploying Machine Learning Pipelines
  Chapter 10: Choosing Training and Serving Infrastructure
  Chapter 11: Measuring, Monitoring, and Troubleshooting Machine Learning Models
  Chapter 12: Leveraging Prebuilt Models as a Service

Index

