Summary
Most developers have encountered code completion systems and rely on them as part of their daily work. They allow you to stay in the flow of programming, but have you ever stopped to think about how they work? In this episode Meredydd Luff takes us behind the scenes to dig into the mechanics of code completion engines and how you can customize them to fit your particular use case.

Announcements
Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
Your host as usual is Tobias Macey and today I’m interviewing Meredydd Luff about how code completion works and what it takes to build your own.

Interview
Introductions
How did you get introduced to Python?
Most programmers are familiar with the idea of code completion, but can you just give the elevator pitch to get us all on the same page?
You gave a presentation recently at PyCon about how to build a code completion system. What was your approach to identifying what fundamental concepts needed to be addressed and how to fit that lesson into the available time?
In the presentation you mentioned that you had built a more full-featured completion engine into Anvil. Can you describe what possessed you to build your own code completion tool?
What are the core components required to build a completion engine?
What are the benefits that can be realized by customizing the completion engine for a given language or task?
Can you describe the feature set and implementation details of the full-fledged completion engine that is available in Anvil?
Beyond the toy example, there are a number of considerations to address if you want to make the completion engine "production grade". Can you talk through some of the obvious edge cases and how to solve for them? (e.g. handling parsing of incomplete code)
What are the inputs that you use to build up the list of candidate tokens for completion?
Once you have a functioning baseline for offering completions, what are some of the signals that you hook into for ranking suggestions?
In your presentation you leaned on the machinery available in the Python standard library. What are some of the ways that you might think about generalizing across languages vs. coupling to a given language?
What design/architectural advice do you have for compartmentalizing logic in a full-featured completion engine?
What are some of the complexities that become a factor when you are trying to scale across an entire code base?
Beyond just being able to parse and process a body of code, there is also the question of integrating with the development environment. What are some of the challenges that get introduced when trying to access the appropriate set(s) of files and code through the editor interface(s)?
What are the most interesting, innovative, or unexpected ways that you have seen code completion applied to developer experience?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on code completion for Anvil?
When is code completion more effort than it’s worth?
What do you have planned for the future of the Anvil code completion functionality?

Keep In Touch
LinkedIn
meredydd on GitHub
@meredydd on Twitter

Picks
Tobias
  "Weird Al" Yankovic
Meredydd
  TimescaleDB
    Data Engineering Podcast Episode
  Promscale

Closing Announcements
Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@podcastinit.com with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

Links
PyCon presentation about building a completion engine
Anvil
  Podcast Episode
Nano
Language Server Protocol
Jedi
  Podcast Episode
Skulpt
Parser
Abstract Syntax Tree
OpenAPI
GitHub Copilot
Halting Problem
Parser Generator
Python Language Grammar Definition
Lezer Parser Generator
Tree-sitter
PyScript
Grafana Tempo Tracing Service

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA