DuckDB: Up and Running: Fast Data Analytics and Reporting
Year: 2025
Author: Lee Wei-Meng
Publisher: O’Reilly Media, Inc.
ISBN: 978-1-098-15969-6
Language: English
Format: PDF/EPUB
Quality: Publisher's layout or text (eBook)
Interactive table of contents: Yes
Number of pages: 308
Description: DuckDB, an open source in-process database created for OLAP workloads, provides key advantages over more mainstream OLAP solutions: it's embeddable and optimized for analytics. It also integrates well with Python and is compatible with SQL, giving you the performance and flexibility of SQL right within your Python environment.
This handy guide shows you how to get started with this versatile and powerful tool. Author Wei-Meng Lee takes developers and data professionals through DuckDB's primary features and functions, best practices, and practical examples of how you can use DuckDB for a variety of data analytics tasks. You'll also dive into specific topics, including how to import data into DuckDB, work with tables, perform exploratory data analysis, visualize data, perform spatial analysis, and use DuckDB with JSON files, Polars, and JupySQL.
Understand the purpose of DuckDB and its main functions
Conduct data analytics tasks using DuckDB
Integrate DuckDB with pandas, Polars, and JupySQL
Use DuckDB to query your data
Perform spatial analytics using DuckDB's spatial extension
Work with a diverse range of data including Parquet, CSV, and JSON
Table of Contents
Preface

1. Getting Started with DuckDB
    Introduction to DuckDB
    Why Use DuckDB?
    High-Performance Analytical Queries
    Versatile Integration and Ease of Use Across Multiple Programming Languages
    Open Source
    A Quick Look at DuckDB
    Loading Data into DuckDB
    Inserting a Record
    Querying a Table
    Performing Aggregation
    Joining Tables
    Reading Data from pandas
    Why DuckDB Is More Efficient
    Execution Speed
    Memory Usage
    Summary

2. Importing Data into DuckDB
    Creating DuckDB Databases
    Loading Data from Different Data Sources and Formats
    Working with CSV Files
    Working with Parquet Files
    Working with Excel Files
    Working with MySQL
    Summary

3. A Primer on SQL
    Using the DuckDB CLI
    Importing Data into DuckDB
    Dot Commands
    Persisting the In-Memory Database on Disk
    DuckDB SQL Primer
    Creating a Database
    Creating Tables
    Viewing the Schemas of Tables
    Dropping a Table
    Working with Tables
    Populating Tables with Rows
    Updating Rows
    Deleting Rows
    Querying Tables
    Joining Tables
    Aggregating Data
    Analytics
    Summary

4. Using DuckDB with Polars
    Introduction to Polars
    Creating a Polars DataFrame
    Understanding Lazy Evaluation in Polars
    Querying Polars DataFrames Using DuckDB
    Using the sql() Function
    Using the DuckDBPyRelation Object
    Summary

5. Performing EDA with DuckDB
    Our Dataset: The 2015 Flight Delays Dataset
    Geospatial Analysis
    Displaying a Map
    Displaying All Airports on the Map
    Using the spatial Extension in DuckDB
    Performing Descriptive Analytics
    Finding the Airports for Each State and City
    Aggregating the Total Number of Airports in Each State
    Obtaining the Flight Counts for Each Pair of Origin and Destination Airports
    Getting the Canceled Flights from Airlines
    Getting the Flight Count for Each Day of the Week
    Finding the Most Common Timeslot for Flight Delays
    Finding the Airlines with the Most and Fewest Delays
    Summary

6. Using DuckDB with JSON Files
    Primer on JSON
    Object
    String
    Boolean
    Number
    Nested Object
    Array
    null
    Loading JSON Files into DuckDB
    Using the read_json_auto() Function
    Using the read_json() Function
    Using the COPY-FROM Statement
    Exporting Tables to JSON
    Summary

7. Using DuckDB with JupySQL
    What Is JupySQL?
    Installing JupySQL
    Loading the sql Extension
    Integrating with DuckDB
    Performing Queries
    Storing Snippets
    Visualization
    Histograms
    Box Plots
    Pie Charts
    Bar Plots
    Integrating with MySQL
    Using Environment Variables
    Using an .ini File
    Using keyring
    Summary

8. Accessing Remote Data Using DuckDB
    DuckDB’s httpfs Extension
    Querying CSV and Parquet Files Remotely
    Accessing CSV Files
    Accessing Parquet Files
    Querying Hugging Face Datasets
    Using Hugging Face Datasets
    Reading the Dataset Using hf:// Paths
    Accessing Files Within a Folder
    Querying Multiple Files Using the Glob Syntax
    Working with Private Hugging Face Datasets
    Summary

9. Using DuckDB in the Cloud with MotherDuck
    Introduction to MotherDuck
    Signing Up for MotherDuck
    MotherDuck Plans
    Getting Started with MotherDuck
    Adding Tables
    Creating Schemas
    Sharing Databases
    Creating a Database
    Detaching a Database
    Using the Databases in MotherDuck
    Querying Your Database
    Writing SQL Using AI
    Using MotherDuck Through the DuckDB CLI
    Connecting to MotherDuck
    Querying Databases on MotherDuck
    Creating Databases on MotherDuck
    Performing Hybrid Queries
    Summary

Index