Skip to content

Latest commit

 

History

History
117 lines (70 loc) · 9.05 KB

preface.asciidoc

File metadata and controls

117 lines (70 loc) · 9.05 KB

Preface

We wrote this book for developers and data scientists looking to build and scale applications in Python without becoming systems administrators. We expect this book to be most beneficial for individuals and teams dealing with the growing complexity and scale of problems moving from single-threaded solutions to multi-threaded, all the way to distributed computing.

While you can use Ray from Java, this book is in Python, and we assume a general familiarity with the Python ecosystem. If you are not familiar with Python, some excellent O’Reilly titles include Learning Python and Python for Data Analysis.

Serverless is a bit of a buzzword, and despite its name, it does contain servers, but the idea is you don’t have to manage them explicitly. For many developers and data scientists, the promise of having things magically scale without worrying about the servers' details is quite appealing. On the other hand, if you enjoy getting into the nitty-gritty of your servers, deployment mechanisms, load-balancers, etc., this is probably not the book for you — but hopefully, you will recommend this to your colleagues.

What you will learn

In reading this book you will learn how to take your existing Python skills to make programs capable of scaling beyond a single machine. You will learn about different techniques for distributed computing, from remote procedure calls to actors all the way to distributed data sets and machine learning. We wrap up this book with a "real-ish" world example in [appA] that uses many of the techniques to build a scalable backend, while integrating with a Python based web-application and deploying on Kubernetes.

A note on responsibility

As the saying goes, with great power comes great responsibility. Ray, and tools like it, enable you to build more complex systems handling more data and users. It’s important not to get too excited and carried away solving problems because they are fun, and stop to ask yourself what the impact of your decisions will be. You don’t have to search very hard to find stories of well-meaning engineers and data scientists accidentally building models or tools that caused devastating impacts, like the new VA payment system breaking, to hiring algorithms that discriminate based on gender. We ask that you use your newfound powers keeping this in mind, for one never wants to end up in a textbook for the wrong reasons.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

License

Once published in print and excluding O’Reilly’s distinctive design elements (i.e. cover art, design format, “look and feel”) or O’Reilly’s trademarks, service marks, and trade names, this book is available under a Creative Commons Attribution-Noncommercial-NoDerivatives 4.0 International Public License. We thank O’Reilly for allowing us to make this book available under a creative commons license. We hope that you will choose to support this book (and the authors) by purchasing several copies of this book with your corporate expense account (it makes an excellent gift for whichever holiday season is coming up next).

Using Code Examples

The Scaling Python ML Github contains most of the examples for this book. The examples in this book are in the "ray" directory, with some parts (namely Dask on Ray) being found in the "dask" directory and Spark on Ray in the "spark" directory.

If you have a technical question or a problem using the code examples, please send email to [email protected].

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Scaling Python with Ray by Holden Karau and Boris Lublinsky (O’Reilly). Copyright 2023 Holden Karau and Boris Lublinsky, 978-1-098-11880-8.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at [email protected].

O’Reilly Online Learning

Note

For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/scaling-python-ray.

Email [email protected] to comment or ask technical questions about this book.

For news and information about our books and courses, visit https://oreilly.com.

Follow us on Twitter: https://twitter.com/oreillymedia

Acknowledgments

We would like to acknowledge the contribution of Carlos Andrade Costa who cowrote with us [ch08] - Ray Workflows. This book would not exist if not for the communities it is built on. Thank you to the Ray/Berkeley community and the PyData community. Thank you for your contributions and guidance to all the early readers and reviewers. These reviewers include Dean Wampler, Jonathan Dinu, Adam Breindel, Bill Chambers, Trevor Grant, Ruben Berenguel, Michael Behrendt, and many more. A special thanks to Ann Spencer for reviewing the early proposals of what eventually became this and Scaling Python with Dask. Any remaining mistakes are the authors' fault, sometimes against the advice of our reviewers.

From Holden

I would also like to thank my wife and partners for putting up with my long in-the-bath-tub writing sessions. A special thank you to Timbit for guarding the house and generally giving me a reason to get out of bed (albeit often a bit too early for my taste).

spwr 00in01

From Boris

I would also like to thank my wife, Marina, for putting up with long writing sessions and sometimes neglecting her for hours, and my colleagues in IBM for many fruitful discussions, that helped him to better understand the power of Ray.