Blog posts tagged "Big Data"

26 posts


Giulia Lanzafame
26 June 2025

Accelerating data science with Apache Spark and GPUs

Article Data Platform

Apache Spark is well known for distributing computation across multiple nodes by means of partitions, with CPU cores doing the processing within each partition. What’s less widely known is that it is possible to accelerate Spark with GPUs. Harnessing this power in the right...



Giulia Lanzafame
10 June 2025

Apache Spark security: start with a solid foundation

Article Data Platform

Everyone agrees security matters – yet when it comes to big data analytics with Apache Spark, it’s not just another checkbox. Spark’s open source Java architecture introduces special security concerns that, if neglected, can quietly reveal sensitive information and interrupt vital functions. Unlike standard software,...



Giulia Lanzafame
10 December 2024

Spark or Hadoop: the best choice for big data teams?

Article Data Platform

I always find the Olympics to be an unusual experience. I’m hardly an athletics fanatic, yet I can’t help but get swept up in the spirit of the competition. When the Olympics took place in Paris last summer, I suddenly began rooting for my country in sports I barely knew existed. I would spend random



robgibbon
23 May 2024

Can it play Doom? Running an AI LAN party on a Spark cluster with ViZDoom

Article AI

It’s all about AI these days, so I decided to try and answer the important question: can you make a Spark cluster run AI agents that play a game of Doom, in a multiplayer LAN party? Although I’m no data scientist, I was able to get this to work and I’ll show you how so



robgibbon
22 February 2024

Migrating from Cloudera to a modern data hub architecture

Article Data Platform

In the early 2010s, Apache Hadoop captured the imagination of the tech community. A free and powerful open source platform, it gave users a way to process unimaginably large quantities of data, and offered a dazzling variety of tooling to suit nearly every use case – MapReduce for odd jobs like processing of text, audio



robgibbon
17 October 2023

Why we built a Spark solution for Kubernetes

Article Data Platform

We’re super excited to announce that we have shipped the first release of our solution for big data – Charmed Spark. Charmed Spark packages a supported distribution of Apache Spark and optimises it for deployment to Kubernetes, which is where most of the industry is moving these days. Reimagining how to work with big data



Canonical
17 October 2023

Canonical announces supported solution for Apache Spark® on Kubernetes

Article Canonical announcements

Today, Canonical announced the release of Charmed Spark – an advanced solution for Apache Spark® that provides everything users need to run Apache Spark on Kubernetes. Apache Spark is suitable for use in diverse data processing applications including predictive analytics, data warehousing, machine...



robgibbon
10 August 2023

Write a Spark big data job with ChatGPT

Article AI

I’ve read and watched more than a few articles about ChatGPT in the last couple of months. It seems the large language model AI hype machine just can’t stop. As somebody with a passion for music production, some of the more interesting things I’ve seen included a guy using ChatGPT to build a virtual effect



robgibbon
3 July 2023

Charmed Spark beta release is out – try it today

Article AI

The Canonical Data Fabric team is pleased to announce the first beta release of Charmed Spark, our solution for Apache Spark. Apache Spark is a free, open source software framework for developing distributed, parallel processing jobs. It’s popular with data engineers and data scientists alike when building data...



robgibbon
3 May 2023

Big data security foundations in five steps

Article Data Platform

We’ve all read the headlines about spectacular data breaches and other security incidents, and the impact that they have had on the victim organisations. And in some ways there’s no place more vulnerable to attack than a big data environment like a data lake.



robgibbon
16 November 2022

Apache Kafka service design for low latency and no data loss

Article Apps

Designing a production service environment around Apache Kafka that delivers low latency and zero data loss at scale is non-trivial. Indeed, it’s the holy grail of messaging systems. In this blog post, I’ll outline some of the fundamental service design considerations that you’ll need to take into account in order to...



robgibbon
31 August 2022

Kubernetes operators – the top 5 things to watch for

Article Charms

Software operators are steadily revolutionising how we deploy and run complex distributed systems. They offer the promise of low-intervention, self-driving software – ideally leading to service reliability gains and better uptime. For an introduction to Kubernetes operators, check out our introductory webinar or...



robgibbon
6 December 2021

Canonical Data Platform 2021 winter roundup

Article AI

Canonical Data Platform: that was 2021. It’s that time of the year again: many folks are panic buying cans of windscreen de-icer spray and thermal underwear, bringing pine trees into the front room and preparing to enjoy an extended break with the family. So we thought to ourselves, what better time than now to take



robgibbon
10 November 2021

SQL Server on Ubuntu Pro: bringing it all back home

Article Cloud and server

Not going to lie, the Microsoft SQL Server is my all-time favourite Microsoft product. For a long time, SQL Server was only available for Windows, but not much is really sacred. So now Microsoft, in collaboration with Canonical, are distributing and supporting several flavours of SQL Server on Ubuntu Pro for Azure. I’m...


