Code Optimization: Filtering dataframes using exact matches in multiple columns

Filtering medium to large amounts of data to extract a relevant subset is a very common task in any data-related project. Often we do this with pandas dataframes. In this post I want to compare some filtering options for exact matches across multiple columns. The idea is pretty simple: we have a dataframe with multiple columns and rows, as well as a list of conditions by which we want to extract data from it....

2023-11-17 · 8 min · Maurice Borgmeier

Code Optimization: Finding the correct spot on the leaderboard

Imagine you’re running a sports event with multiple competitions going on, and you need to keep track of the top 10 fastest times across all disciplines. As each athlete finishes competing in one or more games, they want to know what their spot on the leaderboard is. What’s the fastest way to compute this across a range of competitions? Given an n x m matrix like the one you can see below, where the rows are the disciplines and the columns are the top 10 spots, figure out where player p ranks in all disciplines based on their times....

2023-10-28 · 8 min · Maurice Borgmeier

Even more efficient hashing of columns in a pandas dataframe

One of the joys of software development is that small changes can sometimes make solving the same problem orders of magnitude faster. Revisiting previous solutions with more experience can lead to even better results. I show you how I improved the previous implementation by a factor of 2.7.

2022-12-20 · 5 min · Maurice Borgmeier

Efficiently hashing columns in a pandas dataframe

One of the joys of software development is that small changes can sometimes make solving the same problem orders of magnitude faster. I experienced this recently when implementing a function to generate a hash over multiple columns in a dataframe. Today I’m going to show you how I came up with that solution.

2022-09-18 · 9 min · Maurice Borgmeier

CLI Text styling, Progress Bars and more with Python and Click

In this article we’re going to take a look at some of the quality-of-life improvements Click offers for building command line interfaces. There are a couple of things it makes a lot easier for users compared to building your own CLI, e.g. text styling or progress bars.

2020-08-24 · 11 min · Maurice Borgmeier

Advanced CLI structures with Python and Click

I’m going to show you some of the more advanced patterns and uses for the Python CLI building library Click. These include custom input validation, multi-level CLIs and command groups.

2020-08-17 · 8 min · Maurice Borgmeier

Building CLIs with Python and Click

In this article I’m going to give you an introduction to the Python library Click, which allows you to build pretty command line interfaces in no time.

2020-08-12 · 8 min · Maurice Borgmeier