Tableau Prep: Up & Running / Tableau Prep: настраиваем и запускаем Год издания: 2020 Автор: Allchin C. / Оллчин К. Издательство: O’Reilly ISBN: 978-1-492-07962-0 Язык: Английский Формат: PDF Качество: Издательский макет или текст (eBook) Интерактивное оглавление: Да Количество страниц: 444 Описание: For self-service data preparation, Tableau Prep is relatively easy to use—as long as you know how to clean and organize your datasets. Carl Allchin from The Information Lab in London gets you up to speed on Tableau Prep through a series of practical lessons that include methods for preparing, cleaning, automating, organizing, and outputting your datasets. Based on Allchin’s popular blog, Preppin’ Data, this practical guide takes you step-by-step through Tableau Prep’s fundamentals. Self-service data preparation reduces the time it takes to complete data projects and improves the quality of your analyses. Discover how Tableau Prep helps you access your data and turn it into valuable information. Know what to look for when you prepare data Learn which Tableau Prep functions to use when working with data fields Analyze the shape and profile of your dataset Output data for analysis and learn how Tableau Prep automates your workflow Learn how to clean your dataset using Tableau Prep functions Explore ways to use Tableau Prep techniques in real-world scenarios Make your data available to others by managing and documenting the output
Примеры страниц
Оглавление
Preface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix 1. Why Self-Service Data Prep?. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 A Short History of Self-Service Data Visualization 1 Accessing the “Right Data” 2 The Self-Service Data Preparation Opportunity 3 Tableau Prep Up and Running 4 Summary 4 Part I. Getting Started 2. Getting Started with Tableau Prep Builder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Where to Get Tableau Prep Builder 6 How to Get a License for Prep Builder 7 The Tableau Prep Builder Screen 9 Basic Steps of Data Preparation 11 Input Step 11 Clean Step 12 Output Step 13 Saving a Flow 13 Summary 15 3. Planning Your Prep. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 Stage 1: Know Your Data 17 Stage 2: Identify the Desired State 18 Stage 3: Determine the Required Transitions from KYD to the Desired State 20 Stage 4: Build the Workflow 22 Summary 24 4. Shaping Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 What to Look for in Incoming Data Sets 25 What Shape Is Best for Analysis in Tableau? 26 Changing Data Set Structures in Prep Builder 28 Pivot 28 Aggregate 29 Join 30 Union 31 Applying Restructuring Techniques to the Ice Cream Example 32 Step 1: Pivot Columns to Rows 32 Step 2: Pivot Rows to Columns 32 Summary 33 5. Connecting to Data in Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 Files Upon Files Upon Files 34 Spreadsheets 35 Other File Types 35 Where to Find Your Data Files 36 How to Connect to Files in Prep 37 Considerations for Saving Flows with File Inputs 39 Summary 39 6. Connecting to a Database. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 What Is a Database? 40 How to Connect to a Database Within Prep Builder 42 When to Avoid Connecting to a Database 45 Summary 46 Part II. Data Types 7. Dealing with Numbers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48 What Do We Mean by Numbers? 49 Types of Numbers 49 Category or Measure? 49 Aggregation 50 Formatting Numbers 50 Functions for Mastering Numerical Data 51 Summary 53 8. Dealing with Dates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Why Are Dates Important? 54 Parts of a Date 54 Date Lookup Tables 55 Epoch Dates 56 Excel Serial Number 57 Entering Dates 57 The makedate() Function 58 The dateparse() Function 58 Summary 59 9. Dealing with String Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 What Do We Mean by Strings? 60 How String Data Is Different 61 Character Order 61 Formatting Considerations 61 Common Functions for Preparing String Data 63 Grouping and Replace Options for Working with String Data 65 Summary 65 10. Dealing with Boolean Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 What Is Boolean Data? 66 Why Is It So Useful in Data Analysis? 66 Functions Featuring Boolean Logic 68 Summary 74 Part III. The Shape of Data 11. Profiling Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 What Is a Profile? 76 Why Visualizing the Data Set Is Important 77 Anscombe’s Quartet 77 Visualizations Versus Data Tables 79 How Prep Builder Profiles Data 80 Generating Histograms and Mini-Histograms 80 Selecting Summary Versus Detail Views 82 Highlighting Values 83 Viewing Dimension Counts 84 Sorting 85 Summary 85 12. Sampling Data Sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 One Simple Rule: Use It All If Possible 86 Sampling to Work Around Technical Limitations 86 Volume of Data 87 Velocity of Data 87 Other Reasons for Sampling 88 Reduce Build Times 88 Determine What You Need 88 Sampling Techniques 89 Fixed Number of Rows 89 Random Sample 90 When Not to Sample 91 Summary 92 13. Pivoting Columns to Rows. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 When to Pivot in Tableau Prep Builder 93 How to Pivot Columns to Rows 95 Summary 99 14. Pivoting Rows to Columns. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 When to Use a Rows-to-Columns Pivot 100 How to Pivot Rows to Columns 101 Summary 103 15. Aggregating in Prep Builder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 Comparing Calculations in Prep Builder and Desktop 104 Which Calculations in Prep Builder Differ? 105 Adding the Aggregate Step 109 Where’s the Rest of My Data? 113 Level of Detail Calculation Option 114 Summary 114 16. Joining Data Sets Together. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 How to Join Data Sets in Prep Builder 115 Join Logic and Terminology 119 Types of Join in Prep Builder 120 When to Use Each Join Type 125 Summary 126 17. Unioning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 What Is a Union? 127 What If the Data Structure Isn’t Identical? 128 When to Union Data 130 Monthly Data Sets 130 Data Sets from Web Sources 131 Company Mergers 133 Multiple Tables and Wildcard Unions 133 Summary 135 18. Calculations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 What Do Calculations Do in Data Preparation? 136 Creating a Calculated Field 137 Fundamentals of Calculations 139 The Reference List 139 Syntax 140 Description 141 Example 141 Building the Calculation 141 When Calculations Go Well 141 When Calculations Go Poorly 143 Editing Calculated Fields 144 Recommendations 144 Types of Calculations 145 Numerical Calculations 146 String Calculations 146 Date Calculations 146 Conditional Calculations with a Boolean Output 146 Logical Calculations 146 Type Conversions 146 Level of Detail and Ranking Calculations 147 Summary 148 Part IV. Output 19. Choosing an Output. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 Types of Output 150 Publish to Files 151 Publish to Tableau Server 152 When to Output Data in Prep Builder 152 Outputting Data in the Output Step 152 Previewing Output Data in Desktop 155 Other Considerations for Output Data 157 Summary 157 20. Outputting to a Database. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 When to Write to a Database 158 Clean Data 158 Simplified Joins 159 Staging and Reference Tables 159 Setup for Writing to a Database 159 What to Watch Out For 163 Summary 163 21. Getting Started with Tableau Prep Conductor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 When to Use Prep Conductor 164 How to Get Prep Conductor 165 Loading a Flow to Prep Conductor 165 Other Benefits of Using Prep Conductor 172 Summary 173 Part V. Cleaning Data 22. Creating Additional Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 When Not to Create Data 175 Dynamic Calculations in Desktop 175 Duplicate Records from Joins 177 Creating Additional Columns 177 Using Calculations 178 Pivoting Rows to Columns 178 Joining Data Sets 179 Creating Additional Rows 180 Pivoting Columns to Rows 180 Unioning Data Sets 180 Scaffolding Data Sets 181 Joining Data Sets 181 Summary 181 23. Filtering. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 What Is a Filter? 182 Different Types of Filters 183 Selection 183 Calculation 186 Wildcard 188 Null Values 188 When to Filter Out Columns 189 When to Filter Out Rows 189 Summary 190 24. Removing Data During Input. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Changing Your Data Set Before Loading It 191 Slow Performance, Slow Build, Slow Output 192 Removing Columns 194 Removing Records 196 Summary 198 25. Splitting Data Fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 Basic Splits 200 Advanced Splits: When Automatic Splits Don’t Work as Intended 202 When Not to Split Data 204 Address Data 204 No Clear Delimiter 205 Summary 205 26. Cleaning by Grouping Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 What Does Grouping Mean? 206 Why Use Grouping 207 Improving Accuracy 207 Navigating the Data Hierarchy 207 Smoothing Reorganizations 208 Grouping Techniques 209 Manual 209 Calculations 212 Built-in Functionality 213 Summary 217 27. Dealing with Nulls. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218 What Is a Null? 218 When Is a Null OK? 219 How to Remove or Replace a Null 221 ISNULL() 221 ZN() 222 Merge 223 Summary 225 28. Using Data Roles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226 How to Use Data Roles 227 Custom Data Roles 229 Summary 233 29. Dealing with Unwanted Characters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234 What Is an Unwanted Character? 234 Issues Caused by Unwanted Characters 235 Removing Unwanted Characters 237 Strings with Mistyped Characters 238 Numbers with Unwanted Characters 239 Dates with Mistyped Characters 240 Summary 241 30. Deduplicating. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242 How to Identify Duplicates 242 Causes of Duplicates 243 System Loads 243 Row per Measure 244 Joins 245 How to Handle Duplicates 246 Aggregating: Technique 1 246 Aggregating: Technique 2 248 Pivoting Rows to Columns 250 Summary 251 31. Using Regular Expressions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252 What Are Regular Expressions? 252 How to Use Regexes in Prep 252 REGEXP_EXTRACT() and REGEXP_EXTRACT_NTH() 253 REGEXP_MATCH() 253 REGEXP_REPLACE() 254 Regex Use Cases 254 Replacing Common Mistakes 254 Anonymizing Comments or Feedback 255 Common Regex Commands 256 Summary 256 32. Completing Advanced Joins. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257 Multiple Join Conditions 257 Join Conditions Other Than Equals 260 Filtering with a Join 261 Joining by a Range 262 OR Statements 265 Summary 266 33. Creating Level of Detail Calculations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267 What Is Appending? 267 Exploring Appending Through LOD Calculations 267 When to Use an LOD Calculation 267 How to Write an LOD Calculation in Prep Builder 269 What a Level of Detail Calculation Is Doing 273 Summary 275 34. Doing Analytical Calculations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276 What Is a Table Calculation? 276 Applying Table Calculation Logic in Prep Builder 279 Keywords 280 Analytical Calculations 280 Use Cases 284 Filtering for the Top N 284 Filtering Out a Percentage of Data 285 Summary 286 Part VI. Beyond the Basics 35. Breaking Down Complex Data Preparation Challenges. . . . . . . . . . . . . . . . . . . . . . . . . 288 The Challenge 288 Where to Begin 289 Logical Steps 291 Making Changes 294 Be Ready to Iterate 295 Summary 297 36. Handling Free Text. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 What Is Free Text? 298 Why Is Free Text Useful? 298 How to Analyze Free Text in Tableau 299 Split the Strings 300 Pivot Columns to Rows 301 Clean Cases and Punctuation 303 Use a Join to Remove Common Words 303 Group the Remaining Values 305 Summary 307 37. Using Smarter Filtering. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308 Calculations 308 Boolean Calculations 308 Logical Calculations 309 Regex Calculations 310 Join Ranges 312 Percentage Variance 312 Manual Entry: Level of Detail Calculations 312 Reloaded Data: Join to Previous Output 315 Aggregating the Average Production Cost per Type 316 Joining the Data Sets Together 316 Combining Techniques 317 Summary 318 38. Managing Conversion Rates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319 Challenges of Conversion Rates 319 Applying Conversion Rates in Prep 320 Step 1: Create a Consistent Granularity of Data for the Conversion 320 Step 2: Join the Data Sets Together 321 Step 3: Apply the Conversion Rate 322 Long-Term Strategies for Conversion Rates 322 Managing Frequency 322 Maintaining History Tables 322 Summary 323 39. Scaffolding Your Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324 What Is Scaffolding? 324 Challenges Addressed by Scaffolding 327 Challenges Created by Scaffolding 327 The Traditional Scaffolding Technique 328 Step 1: Input the Data Sets 329 Step 2: Build the Join Calculations 329 Step 3: Join the Two Data Sets Together 331 Step 4: Filter Out Unnecessary Rows 332 The Newer Scaffolding Technique 334 Step 1: Input the Data Sets 335 Step 2: Join the Data Sets 335 Step 3: Add the Reporting Date 336 Step 4: Remove the Scaffold Value 336 The Result 337 Summary 337 40. Connecting to Programming Scripts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338 When to Use the Script Step in Prep 338 Setting Up Your Computer to Use Scripts in Prep 339 Using a Script Step 343 Summary 345 41. Handling Prep Builder Errors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346 Parameter Errors 346 Blank Profile Panes or Data Panes 347 Changing a Calculation or Removing a Data Field Downstream 347 The Data Source Has Changed 348 Errors Within a Calculated Field 349 Incomplete Calculations 349 Unsupported Functions 351 Summary 351 Part VII. Managing Your Data 42. Documenting Your Data Preparation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 Basic Documentation 353 Folder Structure 353 Filenames 354 Data Sources 354 Output 354 Step Names 355 Clean Step 355 Step Descriptions 356 Color 356 Joins 357 Unions 357 Summary 359 43. Deciding Where to Prepare Your Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 Processes to Consider 360 Data Preparation Versus Visual Analytics 361 Data Literacy 361 Organization Size 361 Quality of Technological Hardware 361 History of Data Investment 362 Software Performance 362 Sampling 362 Functionality 363 Documentation 364 Summary 365 44. Managing Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366 What Is Sensitive Data? 366 Public 367 Confidential 367 Strictly Confidential 367 Restricted 367 Managing Data Based on Sensitivity 368 Production Versus Development Environments 368 Deleting Data 369 When Data Becomes Outdated or Irrelevant 369 When a Customer or Client Leaves 369 Summary 369 45. Storing Your Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370 Inaccessibility 370 Don’t Break the Law 371 Don’t Delete Operational Data 371 Do Grant Access to Data for the Experts 371 Do Document Your Sources 372 Slow/Unresponsive Performance 372 Overwriting Risks 372 Grant Read-Only Access 373 Train Before Publishing 373 So, Where Do You Write That Output? 373 Summary 374 46. Using Identifiers and Keys in Data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375 What Is an Identifier? 375 What Is a Key in a Database? 376 Using Keys and Identifiers in Prep 377 Creating Identifier Data Fields in Prep Builder 379 Summary 382 47. Keeping Your Data Up-to-Date. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383 Refreshing Data 383 Full Versus Incremental Refreshes 384 Setting Up Different Types of Refresh 384 Full Refresh 384 Incremental Refresh 385 What to Watch Out for When Refreshing Data Sources 388 Changing Data Values 388 Altering the Structure of Sources 388 New Data, New Input 388 Summary 389 48. Using History Tables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390 Why Are History Tables Required? 390 What to Consider When Creating History Tables 391 Ability to Join to Live Data 391 Relevance of Information 391 Frequency of Updates 392 Level of Granularity 392 Performance 392 Data Regulations 393 An Example History Table 393 Summary 397 49. Evaluating Whether You Need Prep Builder at All. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398 A History of Data Preparation in Tableau 398 Where to Try Desktop First 399 Simple Joins 399 Unions 400 Single Pivots 402 Where to Start with Prep Builder 403 Summary 404 50. Final Thoughts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405 Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
Allchin C. / Оллчин К. - Tableau Prep: Up & Running / Tableau Prep: настраиваем и запускаем [2020, PDF, ENG] download torrent for free and without registration
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum You cannot attach files in this forum You can download files in this forum