AWS-Blog: Building Data Aggregation Pipelines using Apache Airflow and Athena
Business insights are frequently generated from aggregated data, like daily sales per market segment over time. In this blog post we’ll use Apache Airflow to build a data aggregation pipeline that utilizes Amazon Athena for the heavy lifting. We’ll cover best practices that you should follow to build a production-ready system.
AWS-Blog: How to accidentally create read-only DynamoDB items
In a recent Developing on AWS course I was faced with an interesting question about DynamoDB: what happens if you create an item whose attributes belong to a global secondary index but have a data type that doesn’t match the index? My intuition was wrong, so let’s check out what actually happens.
AWS-Blog: Making the TPC-H dataset available in Athena using Airflow
The TPC-H dataset is commonly used to benchmark data warehouses or, more generally, decision support systems. It describes a typical e-commerce workload and includes benchmark queries to enable performance comparison between different data warehouses. I think the dataset is also useful to teach building different kinds of ETL or analytics workflows, so I decided to explore ways of making it available in Amazon Athena.
AWS-Blog: Enabling Apache Airflow to copy large S3 objects
If you’re trying to use Apache Airflow to copy large objects in S3, you might have encountered issues where S3 complains about you sending an InvalidRequest. We will fix that in this post by writing a custom operator to handle the underlying problem.
AWS-Blog: You can't Opt-Out of Performance Tracking in the AWS Console
Even though I had opted out of performance measurement cookies, I noticed a lot of web requests that look like performance measurement in the AWS console. In this article I investigate what’s being sent and what we can do about it.
AWS-Blog: Improving Accessibility by Generating Image-alt texts using GenAI
In this article, we’ll be using GenAI to generate alternative texts for images in Markdown documents, which will help people relying on screen readers to access your content.
Bag in Black
There are things whose price doesn’t seem proportionate to their usefulness. Simple things. Common things. One such item, which I bought from a drugstore in Germany a couple of years ago, is a reusable shopping bag. It’s a simple bag, made, I think, out of polyester and comes with a smaller bag to carry it in. I paid maybe 2€ for it. Since then I’ve been using it 3-4 times per week to carry around all kinds of stuff....
Installing Apache Airflow on macOS
I’m currently diving a bit deeper into Apache Airflow and want to further my understanding of the system. I chose to install it locally on my Mac because a managed service like Managed Workflows for Apache Airflow (MWAA) on AWS limits how much I can tinker with the system. For anything remotely production-related, I’d still go with the managed service. I used the Airflow: Getting Started documentation to do exactly that, getting started....
Casio F-91W: a simple watch
A few years ago I got a Casio F-91W watch (Wikipedia). I was looking for something that can double as a watch and stopwatch while explicitly not being a smartwatch. I wanted something light and sturdy. I was also looking for something inexpensive as I wasn’t sure if I was a watch person since I hadn’t worn any kind of jewelry in the broadest possible sense of that word for many years....
AWS-Blog: Going on an Industry Quest: Manufacturing and Auto
Using Industry Quest: Manufacturing and Auto you can learn about building IoT and factory management solutions in AWS. It’s a game that teaches you about real-time monitoring, predictive maintenance, machine learning, and data analytics. This blog post gives an introduction to the game and covers my thoughts about its usefulness.