The initial server, Alice, couldn't handle the rapid growth from 500 users to roughly 36,000 during the Twitter exodus. Moving out of the basement was necessary to scale properly and to stop running a high-traffic service on a residential ISP connection.
The core components include a Postgres database on a Hetzner bare-metal server, PgBouncer for connection pooling, Redis for caching, Sidekiq for handling background jobs, and Puma for the web application. Media storage is handled by DigitalOcean Spaces, and a CDN is used for caching media files.
Federation is managed through Sidekiq, which processes incoming and outgoing events across the Fediverse. Scaling challenges are primarily focused on Sidekiq queues, which can get backed up during high activity periods. The team is also working on redundancy for the Postgres database and web server.
Hackaderm costs just over $600 per month, with the majority of the expenses going toward media storage and the CDN edge. DigitalOcean provides the object storage at a small discount rather than as a sponsorship.
Hackaderm is part of the Nivenly Foundation, which receives donations and sponsorships. Contributions from individuals and organizations support the server's infrastructure and operations.
The primary scaling challenges are related to Sidekiq, which handles federation tasks. The team is exploring auto-scaling options for Sidekiq queues to handle spikes in activity. The database also requires redundancy to ensure failover capabilities.
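As a rough sketch of that reactive scaling idea, the check below polls Sidekiq queue depth and calls a scale-up hook past a threshold. It assumes Sidekiq's usual Redis layout (each queue is a Redis list named queue:<name>); the threshold, queue names, and scale_up hook are illustrative, not Hackaderm's actual tooling.

```python
import redis

THRESHOLD = 10_000                      # arbitrary backlog threshold
r = redis.Redis(host="localhost")       # placeholder Redis endpoint

def scale_up(queue: str) -> None:
    """Placeholder hook: in practice this might lease another server or
    start additional Sidekiq processes on an existing one."""
    print(f"would add Sidekiq capacity for {queue!r}")

# Typical Mastodon queue names; the real list may differ.
for queue in ("default", "push", "pull", "ingress"):
    backlog = r.llen(f"queue:{queue}")  # Sidekiq keeps pending jobs in a Redis list
    if backlog > THRESHOLD:
        scale_up(queue)
```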
Hackaderm allows individual account blocks and server-wide defederation. The moderation team is critical for maintaining a healthy community. The server also subscribes to defederation lists to block problematic content from other servers.
The main long-term risks include the rising cost of media storage and potential changes in legal jurisdictions. The team is also concerned about the sustainability of infinite storage and the impact on user experience.
Hackaderm performs weekly full backups and daily incremental backups, retaining data for 21 days. The backups are stored in a way that allows for quick recovery in case of a disaster.
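A small worked example of that retention scheme, with invented file layout and dates: to restore, you need the most recent weekly full plus every daily diff taken after it.

```python
from datetime import date, timedelta

RETENTION_DAYS = 21

def restore_chain(target: date, backups: dict) -> list:
    """Walk back from the target day to the most recent weekly full,
    returning the full plus every daily diff after it, oldest first."""
    chain = []
    day = target
    while day in backups:
        chain.append(day)
        if backups[day] == "full":
            return list(reversed(chain))
        day -= timedelta(days=1)
    raise ValueError("no full backup inside the retention window")

# Hypothetical window: weekly fulls on Sundays, daily diffs otherwise.
today = date(2024, 10, 23)
backups = {today - timedelta(days=i): ("full" if (today - timedelta(days=i)).weekday() == 6 else "diff")
           for i in range(RETENTION_DAYS)}

print(restore_chain(today, backups))  # the most recent Sunday full plus the diffs since
```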
Hackaderm has approximately 55,000 accounts, with around 11,000 monthly active users. The server has seen steady growth since its inception.
This is Ship It with Justin Garrison and Autumn Nash. If you like this show, you will love The Changelog. It's software news on Mondays, deep technical interviews on Wednesdays, and on Fridays, an awesome talk show for your weekend enjoyment. Find it by searching for The Changelog wherever you get your podcasts. Ship It is brought to you by Fly.io. Launch your app in five minutes or less. Learn how at Fly.io. ♪
Hey, friends. I'm here with a friend of mine, Dave Rosenthal, CTO at Sentry. So, Dave, I've heard you say trace connected before. I know that the next big frontier for Sentry is tracing this metrics platform. How'd you get there? And what do developers need to know about how you're thinking about this product? Before I came to Sentry, Sentry was sort of working on a metrics product.
We started building that metrics product in a more traditional way with the metrics just kind of being more like kind of just disconnected. They're just like another source of data. They're another table somewhere. Yeah, you can line them up by time. Sure, you can drill into them a little bit, but really they weren't connected to that trace. And we took a big step back from that after trying it ourselves, after trying it with users.
and realizing that there was a whole class of things we wanted to be able to do that you couldn't do with this kind of disconnected metrics. And so, you know, we changed our APIs, we changed our approach, and we're kind of now really on a very clear direction of building a metric system that isn't one of these kind of like legacy disconnected metric systems. It's trace connected.
so that we can get that kind of rich debugging context when you actually dig into a real problem. And, you know, there's trade-offs. It's not quite as easy to just like log random metrics at random times into a system. You have to put the thought when you're building the telemetry into how this metric actually does relate to the structure of the code that's running underneath.
But we think it's the right tradeoff for users because it's like a little extra time to figure that connection out. But then when you actually go to use the data, the connection's there for you. So small investment, big return. And it's just an example of how we're kind of making decisions internally to set our users up for success with this trace connected idea. Very cool. It's cool to see how you think about getting to the end result with any of these projects you're building.
It's so cool to see behind the scenes the way you think and the way you iterate. Okay, friends, go to Sentry.io. Use our code CHANGELOG to get $100 off the team plan. That's basically almost four months for free. Sentry.io. Use the code CHANGELOG. Don't use anything else. Use our code. It's good for you. Once again, Sentry.io.
Hello and welcome to Ship It, the podcast all about what happens after git push. I'm your host, Justin Garrison, and with me as always is Autumn Nash. How's it going, Autumn? Hey, everyone. Today on the show, we have a conversation I've been looking forward to for quite a while because I've been interested in...
Mastodon, and specifically like how that infrastructure runs and how to scale that and just how the servers talk to each other. And I like decentralized things in the non-scammy financial ways. So it was kind of something that I- I love that you had to be specific about that. Yeah.
Like I love the internet, right? And the internet is the most decentralized system you can get and it scales really well for a lot of reasons. And then I see this like Mastodon thing that's like, hey, we can kind of do that internet thing, but like portable on social networks and let people connect. I've been blogging for a very long time. I used to have a list of web ring sites that I also would send people to. And people are old enough to remember web rings. Wait, what's web ring? Web rings weren't like a way to communicate between sites, but it was a way to like list out like sites you're also reading.
It was like a recommendation engine for people that had websites. They're like, I have a blog. If you like my blog, you probably like these other people's blogs. And they would link to other people. So it was like a discovery engine before things had discovery engines. It was all personally curated. And then no one ever updated them. You're like my favorite old best friend. Like...
It's weird because I never grew up with a computer. I know. I've had a computer since I was like, like I fell in love with computers in elementary school. And you're like, I got one in college. How, sir? Like, did you just not come out of your dorm room at some point? Like, did you just lock yourself in there and you found every... How did you have time to date Beth? Like, it is a miracle that you got married. We met because I fixed her computer. I didn't tell you that? No, yeah. It is the cutest thing.
I needed a job in college and I got a job because there was a lab across the hall. The dorm room had a lab, which is what I used all day every day because I had no computer myself. Can we talk about how cute and nerdy that is? You fixed her computer and then you were married for 20 years. I didn't fix her computer though. I worked on her computer. No, I didn't sabotage it. It was a whole thing where like- He was like, she has to come back if it doesn't turn on ever. I don't know.
I worked at the IT department. It was a student-run IT thing. So we would repair computers. We also were one of the first colleges that had Wi-Fi throughout the whole school. And so every year when people would come, they would come with desktops. No one had laptops in 2001. And so they would come with desktops and like, I need to get on the network. We're like, we're going to install a Wi-Fi card for you. You went to college when there were no laptops?
Yeah. People, we, we had a laptop rental program so that people could rent laptops if they wanted to, and those came with optional Wi-Fi. It was, yeah, it was a thing, but she came with a desktop. That was an old, uh, what was it, a MacTell. She carried a desktop in, everyone moved in with desktops. Everyone came with desktops, and she needed a Wi-Fi card to connect. And it was in our shop or whatever. And then we're like, oh, we need to go deliver this to the customer, we have to go bring it back to the dorm room. And they're like, okay. And I was working at the time.
"Hey, where is it?" And like, "Oh, it's someone else knew her." Someone's like, "Oh, it's Beth."
I was like, who's this? Like, I don't know who this person is. They're like, oh yeah, she has a shaved head. And I'm like, I've seen her around. I want to meet her. Can I please bring that computer back? And I literally was like, let me take over this ticket. I delivered the computer and I did none of the work. Okay. But this is the cutest thing I've ever heard of. He was like, bro, this is my ticket. Like, don't even, don't even look at the ticket. Like I'm going, I'm pointing over on my left and I can see the computer. I still have the computer. The computer, the reason we met, I still have the computer.
It is a terrible knockoff Mac from when Apple used to license their operating system. This is like the cutest thing ever. Like this is the most warm and fuzzy. It's like, okay, I believe in love again. Like that is how we met because I didn't fix her computer. I delivered her computer after it was fixed. That is the most adorable story ever. Like it's so cute. Yeah.
I have like a small tear like right there. Like it's so warm and fuzzy. Like that is the cutest nerd story ever. Way before Bumble and dating apps. Damn it. You mean I have to fix people's computer to find my soulmate? This is bull. If you can fix someone's printer, that might be like a new. I want someone who can fix my printer. When's your 3D printer coming? Never. Sorry. Sorry, Beth.
Anyway. That's not what we're going to talk about. I'm so bitter. Do you see the filament on the ground? Do you see it? Do you see it? It's just haunting me. It's like tons of filament, but you're sad and you have no 3D printer. You're over there with a blow dryer trying to melt it into some sort of figure. Anyway.
Sorry. That's not what you heard. Preston Duster is on the call. You are an infrastructure architect at Twilio, which is not at all what we want to talk about. No, not today. I have questions. So what does Twilio do? What does Twilio do? So I would call us a telecommunications company, right? So we kind of grew up by building APIs that abstract the complexity of doing things like making phone calls or sending SMS.
We've grown into a lot of different channels now and into almost like the contact center space. Like who uses Twilio, and for like what? Maybe like, I'm a paying customer, just FYI. Yeah. Yeah. Yeah. So we started out really marketing to engineers, right? So people who wanted to build apps to send, say, SMS, right. And what we've grown into is the CPaaS or, uh, customer service platform, really, where it's all about, so like when you order from Uber Eats or DoorDash or whoever and you get that text that says like, hey, your Dasher is waiting for your order, right? If it sends it to you by SMS, there's a potential that's going through Twilio, right? Because they would use our API to abstract all that complexity. Did I take that too far? Yeah, exactly.
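For a concrete picture of the kind of abstraction being described, here is roughly what sending that SMS looks like with Twilio's Python helper library; the credentials and phone numbers below are placeholders, not anything from the show.

```python
from twilio.rest import Client

# Placeholder Account SID and auth token; real values come from the Twilio console.
client = Client("ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", "your_auth_token")

message = client.messages.create(
    body="Your Dasher is waiting with your order.",
    from_="+13235550100",  # a number you rent through Twilio (placeholder)
    to="+15555550123",     # the customer's number (placeholder)
)
print(message.sid)         # Twilio returns an SID you can use to track delivery
```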
The thing that I've been wanting, and I've been paying for a Twilio number for years because I generated a number and it's like a buck a year. He's like, thank you. It's like, yeah, like you keep the phone number for a year. And it was like an LA, a 323 number. And it was a good, I remember what the number was. I'd have to look it up now, but I'm just like every year I'm like, yeah, I'll pay the dollar. I'll keep it. It's like a domain name basically. I love how engineers just collect numbers.
domains. Names got all the good ones, right? Yeah. But the thing I've been wanting forever from Twilio is iMessage or RCS support. And I know you, you can't talk about roadmaps or anything like that, but like I, I ended up going to another one, SendBlue. I'm pretty sure they're built on top of Twilio and they implemented the, the like iMessage stuff on top of like the base number handling. But like it does iMessage properly.
And so I was like, always like, well, I want like RCS. Like, I just want like straight, like now that Apple iOS supports RCS too, I'm like, how do I get a good like number RCS interaction? That's a little more portable than iMessage. Yeah.
Yeah, totally. I can send you some docs on RCS stuff. So is it working now? The last time I looked at it was last year. Yeah. I'll, I'll send you some stuff. Sweet man. Not another project. I got to finish that project I shelved a year ago. Cool. But we want to talk about Hackaderm. You, you are part of the infrastructure team at hackaderm.io, which is one of the largest Mastodon instances that I know of. Uh,
I know there are some big ones out there. I mean, obviously like Mastodon Social, I think is the largest still by probably orders of magnitude. I'm on your server. You're on Hackaderm? Yeah. And so I want to talk about like, you've had that for a little while now. I know Chris Nova and friends started it and it's been going for a long time. And one of the things I loved about how you're running it is how open you are with the data. And you're like, this is how this works. And this is how the metrics are. This is how we scale it. And this is...
all of that stuff is really cool to see from like the outside of like, Oh, cool. I don't have to run it to see your, your dashboards. Yeah, absolutely. Yeah, no, I was, I mean, I was super fortunate to cross paths with Chris while she was with us, you know, while I was at Twilio, you know, what was it? Probably about like two years ago now, right. When there was the, the large Twitter exodus, right. One of the first waves, right.
I know, it seems to be like another one every six months. Yeah, we're on wave seven now or something like that. But yeah, Chris had started Hackaderm literally running it out of her basement. There was an old Dell PowerEdge named Alice that got us through the first days of Hackaderm. And then we realized with that really big wave from Twitter, I think we went from something like 500 people to I think we had a peak of like 36,000 over the course of a handful of weeks.
So, you know, ridiculous growth, right? Which kind of prompted us to say, hey, we've actually got to get out of the basement and build some proper infrastructure, right? So like I said, I crossed paths with Chris and she was like, hey, you want to come help out? And, you know, the rest is kind of history. So it's been fun riding the growth waves.
especially, you know, using purely open source software, purely decentralized software like Mastodon is. And yeah, just really learning it and how it scales and, you know, building our infra and tuning it to serve the folks that are on it today. It was so cool watching her do like the different, she would do kind of like
I don't know if it would be an AMA, but she would talk about the different infrastructure and how she was building it and all that. That was the coolest part about Mastodon, just the different ways that she would be so open and talk about the different servers that she was using and
Like she would talk about how she's going to like do maintenance or, you know, all that cool stuff. And it was really cool that she would like just live stream it. She was just like, I'm just going to, I got some stuff to do. Let me jump on in front of a camera and we're going to talk through it. That was the best part of Mastodon. That's probably like one of the few things that I would look forward to on there. Hmm.
Yeah. So how, how has that been? Like, Mastodon, you're using the like official Mastodon software, right? Because like there are forks, there are other options. I used one for a little while that Cloudflare wrote, um, Wildebeest, uh, which was all like integrated into their Workers and stuff like that. And I thought it was really cool. Cause it was like a quote unquote serverless, uh, Mastodon. Uh, but it was like extremely difficult to
upgrade and to change things. But you're using like the official Mastodon software, which is a Ruby on Rails app with like a Postgres database, right? Yep. You got it. That's kind of the core. Yep. Yeah. And if anyone has ever run Ruby on Rails, you're going to deal with Sidekiq. You're going to deal with PgBouncer, things like that, or like the traditional ways of scaling, right? Can you describe that path of like, hey, we went from 500 users to over 30,000?
Yeah, totally. What's awesome is you actually hit the main actors in the ecosystem. So really looking at it from the bottom up. We went from Alice, which was one server that was hosting everything,
The plan was then to break things out, right? And like you said, Postgres kind of at the bottom as the database. What we ended up doing there is we purchased and we actually run metal servers in Hetzner in Germany and their Frankfurt data center. So Postgres sits on that. So it's a metal server. It's quite large, plenty of spare capacity for growth, those sorts of things.
And there's supposed to be, and this is one of our big projects that we've got to go do, is build up our secondary replica, right? So that we've got some good failover, good redundancy there. But that's kind of at the bottom, right? And as you mentioned, as we go up the stack, PG Bouncer,
absolutely for connection pooling and making sure that we're not overloading the database in that sense. There are a pair of Redis's that do both like kind of ephemeral caching and a bit more like permanent type caching and that helps speed up interactions through the web server. As we go up from there, you mentioned Sidekiq. Sidekiq is really the heart of how all the decentralized stuff works. And if you can imagine at any point in time in the Fediverse,
So many events are happening, right? And you've got to both advertise events out as things are happening on your server. You've got to be able to pull events in that are happening in the world and process those in a reasonable amount of time. So lots of Sidekiq. So a good amount of our time and energy and some of the server capacity is just making sure those jobs are doing what they need to go do.
Right. And we've got a lot of different queues involved, a lot of parallelism involved there as well. And I think that at least from my understanding, that's like the core difference to me between something like Mastodon and something like Bluesky, where all of the heavy lifting of how the world is interacting is on individual Mastodon instances,
where every single one of them, I like a post, I send an update, I do something that needs to go tell all of my followers on different servers, and then I need to hear from all of those other people on all the different servers. So like on a single server, it seems like it's pretty constrained. Like if everyone only followed Hackaderm people on Hackaderm, you're probably fine, right? Like you're not doing a lot of network traffic. You're not doing like, I can do internal processes, do that stuff. But once you start having people
on dozens of other servers, or lots of people following other people on other instances, that seems to cause a lot of extra stress on your instance. Absolutely. Yeah. Federation's work is fun, right? Because as you can imagine, as that social graph grows, you've got to interact with increasingly more and more endpoints around the world. And we've got some folks who have some really large followings.
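To make the fan-out concrete, here is a toy sketch (not Mastodon's actual code) of how one post from a well-followed account turns into a pile of federation delivery jobs, one per remote inbox rather than per follower; the handles and servers are made up.

```python
from collections import defaultdict

# (username, home server, shared inbox URL): hypothetical follower data
followers = [
    ("alice", "mastodon.social", "https://mastodon.social/inbox"),
    ("bob", "fosstodon.org", "https://fosstodon.org/inbox"),
    ("carol", "mastodon.social", "https://mastodon.social/inbox"),
]

def deliveries_for(post_id, followers):
    """One delivery job per remote inbox: servers that expose a shared
    inbox receive the post once, no matter how many followers live there."""
    servers_by_inbox = defaultdict(set)
    for username, server, inbox in followers:
        servers_by_inbox[inbox].add(server)
    return [{"post": post_id, "inbox": inbox} for inbox in servers_by_inbox]

jobs = deliveries_for("post-123", followers)
print(f"{len(followers)} followers -> {len(jobs)} delivery jobs to queue")
```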
And what's fun is sometimes that causes spikes, right? Like you can go look at Grafana, you can go look at our queue depths and see where certain events are happening just because there's so much of that kind of chat and back and forth happening. I remember a long time ago,
It was Twitter had, they said like they had a whole dedicated server rack for Justin Bieber. I wouldn't be surprised. It was just like, because of the architecture, whenever someone with a big following was sending out an update, it caused so much traffic, they just had to scale it up. And for a long time, they were on-prem, they had bare metal stuff. And it was just like, we have a rack for Justin Bieber.
And that's just like his, it was just like, wow, that was amazing. Just to think of that, I was like, okay, how often is he interacting? Like, it doesn't matter. It's just like, cause we want that to go out to the millions of people that want to see the thing he's saying. And then they come back and interact with it and all that stuff. And all of those jobs just have to process.
In some order, they have to coordinate and they have to re-sync and everything. Wow. Yeah, exactly. That's one of the best stories or I guess use cases in... Data-driven applications? Yes, Designing Data-Intensive Applications. They talked about when they had to split the architecture for really popular Twitter users versus regular Twitter users. And it's amazing how much work
and how different each user was. Tim Banks on the show just a few months ago or whatever, was talking about that with the hot girl problem, right? Well, he was talking about the Tinder version, which is wild because we think of it's even just funnier because people use travel mode so much, you know what I mean? And just moving from place to place, who would have thought that would be a whole different reason that you had to change architecture? It's crazy the way that we've changed the way that we use the internet and interact with different
applications for different things and how that causes different architecture.
Am I the only one who like, I'll go download an app or I'll go to websites and like something's slow or something acts weird, and I try to figure out what the architecture behind it is? Cause I feel like a lot of times I'm like, oh, I bet this is like a Sidekiq problem, right? Like they're trying to scale up something, like there's a queue. Trying to diagnose random apps. I totally do it. I'm just like, I bet this is like a queuing problem somewhere, that someone is like, oh, you just didn't scale up this portion of it or, you know, retries or something like that.
Yeah, let's see. So other stuff in the architecture. So Mastodon Web is a Puma application. So in our ecosystem, we actually only have one instance of that. And again, that's another little mini project that we've got, is to build some redundancy in there and a scale point there. And then the last edge for us is what we call our CDN edge,
but something we're hosting in Linode or Linode, I guess, depending on how you like to say that. And we've got points of presence across the world, you know, just for caching, for speed. Linode got bought by Akamai, right? They did, yes, yeah. So I'm still stuck on the old name. I'm pretty sure the branding still says they have Linode by Akamai or something like that, yeah. Yeah, so we have those points of presence, so like kind of West, East US, Europe,
Japan, and occasionally light those up. At what point did you feel like you had to have a CDN? So actually, when we did the move from Alice from the basement into kind of our current architecture, we implemented it then, right? And actually, a big part of it was a lot of the traffic involved in Mastodon is media files, right? So images, movies, audio, whatever, right? And the stress that it put on just like the core server,
you know, even with caching to provide that better experience. Like we were like, okay, we've got to do a CDN. We have to have something that's closer to our folks. And it's distributing that cache to take the load off the core server. So that was already in the architecture there. If I remember correctly, when it was on Alice, the limiting factor, there was NFS for that storage, right? Oh yes. Yeah. It was NFS on the backend that was storing that. Where is that stored now? Cause that's a lot of just data. Cause it's on a CDN, but you got, like, what, hundreds of gigs maybe that are coming in? And, oh, uh, terabytes, easily, right? So that's all in DigitalOcean now. We're using their Spaces product. That's right. That's right. Okay, an S3-compatible API.
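Because Spaces speaks the S3 API, talking to it looks like talking to S3 with a different endpoint. A hedged sketch using boto3 follows; the region, bucket, keys, and credentials are placeholders, and this is not Hackaderm's actual tooling (Mastodon's own storage layer does the equivalent for every attachment).

```python
import boto3

session = boto3.session.Session()
spaces = session.client(
    "s3",
    region_name="fra1",                                   # hypothetical region
    endpoint_url="https://fra1.digitaloceanspaces.com",   # Spaces endpoint for that region
    aws_access_key_id="SPACES_KEY",                       # placeholder credentials
    aws_secret_access_key="SPACES_SECRET",
)

# Upload one media file the way any S3 client would.
spaces.upload_file(
    "cat.png",
    "hackaderm-media",                                    # hypothetical bucket name
    "media_attachments/files/cat.png",
    ExtraArgs={"ACL": "public-read", "ContentType": "image/png"},
)
```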
Not to distract from like where this is going, but like how do you guys fund this? That is a super great question. Um, so we were super fortunate in that, throughout the lifetime of Hackaderm, right? We are part of what we call the Nivenly Foundation, which is an open source nonprofit foundation that Chris founded, again, about two years ago. I'm a member of the board there. And that's kind of like the umbrella organization. And we have folks who are kind enough to contribute to us, believe in the mission. And part of those contributions go to Hackaderm as well. So we have people who are directly contributing to Hackaderm. We have people who are contributing to Nivenly.
And that's where that comes from. What does Nivenly do outside of Hackaderm? Yeah, totally. So Nivenly, we're all about, and again, this is a lot of Chris's vision, was building a foundation that can help open source projects really identify what we call pattern of patterns. So how do we start new open source projects? How do we fund them? How do we help them self-govern, right? So that the contributors, the maintainers of those open source projects can retain control.
and direction and vision for their individual projects. So really about teaching and finding different unique ways that we can support them. So again, open source contributors can go do the open source thing. And the dream would really be able to support them in such a way that maybe they don't have to have their normal day job. It's really interesting.
Yeah, totally. And we've got a couple of projects under that right now, so Hackaderm being one of them. I know Nivenly's been around for a little while and Chris and I were talking about it like back in the day. Are you open about the like,
funding for that? Because I know there's donation side of it. And some of it is like actual, like I pay dollars for it. And there's some of like sponsorship, like DigitalOcean is sponsoring the object storage for Hackaderm. Do you break out the like where the spend goes? Like, is there like a line that says like Hackaderm costs us this much money?
Yeah. So funny enough, I'm the treasurer for Nivenly. Great. So yes, so we do track all that on the backside. We have not, or I will actually even say, I have not done a good job of publishing that regularly. But our intent is to publish that on a quarterly basis and be fully transparent about, you know, what are the types of in-kind donations are we getting? What are the types of membership donations that we're getting? And then, yeah, where do those costs break down?
So how much does Hackaderm cost? So Hackaderm right now probably runs us just over $600 a month. So the vast majority of that spend goes towards media storage. Well, that media storage, the CDN portion, right? Because the DigitalOcean side is sponsored, right? It actually isn't. So I think right now we're getting like a little bit of a discount. So yeah, DigitalOcean, if you're listening, like sponsor, you know, we'd be happy to talk. But yeah, no, that all comes out of Hackaderm's funding. Yeah.
So did you guys buy the hardware up front with donations? Yeah, so we're actually leasing that through Hetzner. So they have actually... Hetzner has incredibly reasonable pricing, both for metal and their cloud products. Who'd have thought? Yeah, we leased a few servers from them. And yeah, that's how we're operating today. So we actually don't own any of the hardware at this point. I didn't even know that they had a cloud. Like I knew that they did...
leasing for bare metal, but that's really interesting. Yeah, totally. And that's actually one of the interesting quirks or maybe features of Hackaderm, right? As we talked through the stack there, all these pieces are in different clouds, right? We explicitly chose not to go with one of the larger clouds and put
all our eggs in one basket. With the idea being that, because again, the Mastodon footprint is relatively straightforward and simple. Simple is probably the wrong word, but like... Simple if you're familiar with Rails, right? Exactly. So there's effectively one application that we're hosting. We said, hey, we're going to go multi-cloud. So in the event that we have an issue with one of them, we can easily flip that piece that's being hosted there elsewhere. Right.
What's up, friends? I'm here with a good friend of mine, Adam Jacob, co-founder and CEO of System Initiative.
And I'm pretty excited to have him here because that means System Initiative is out there. It's GA. Adam, I heard that you launched something. Yeah. Oh, I'm stoked. We did. Yeah, we launched something on the 25th of September. And yeah, you can use System Initiative now by going to a website and signing up and three clicks and you're in. And then you can automate infrastructure. It's sick. It's the coolest thing in the universe. I'm so proud of it.
Well, let's level some folks up. Let's level up the Terraform folks, the Pulumi folks, the AWS CDK folks. As of system initiative being GA, these folks are kind of doing things the old way, right? Yeah, I mean, that's what I hope is true. Okay.
I think, look, here's what it is. Basically, we figured out that part of the reason that it's so hard for us to achieve the outcomes we're looking to achieve with the kind of DevOps and operational work that we do is because the tools we're using sort of help bring about those tough outcomes. It's a lot harder to like,
write static code, have your friends review it. In System Initiative, what you do is you use this like living architecture diagram to put together all the different relationships between the things that you use. And then you can program that architecture diagram to do all the stuff you need it to do. So like it automatically understands how to do things like, you know, create resources and delete them or update their tags or do those things.
But then you can also extend it with your own custom policy. And the whole thing happens in real time in multiplayer. Let's say you're going to like build some infrastructure. You've got to go, you know, use an AWS account. You're going to launch a new service. So you've got to go set up all the different pieces, the VPCs and the
EKS clusters and, you know, ECS and database services. And you got to set up IAM roles. There's all this stuff you got to do. With System Initiative, what will happen is you'll sign up, you'll get this workspace, and then you'll have this list of all the different architecture assets that AWS provides. And what you'll do is throw those things into this big diagram in the center of the screen, which is basically this living architecture diagram. And then you'll connect them together, just like you would if you were drawing an architecture.
And what it's doing when you do that is actually writing the code to describe how these things work. And it's running it as a simulation. So it's telling you in real time, this would work or this wouldn't. So you don't have to wait. There's no like long feedback loops. We actually vet all of that infrastructure and all that architecture in advance. Then you can say, hey, this looks good. It's what I want to see in the real world. And you can apply that change set. And it's keeping track of all the different things you have to do to actually go make that infrastructure real.
And then it goes and does it. And then after it does, it keeps track of those things too. So you can see both sides. You can see the real thing in the world that is what you created, and it's attached to the model of what you thought you wanted. And then you can use that to manage it over time. And then when you have customizations or tweaks or things you need to build for yourself, you can go write that directly into the system in real time in these same kind of change sets that you use to do the infrastructure.
And so that's what it's like to use System Initiative. It's the most powerful, intuitive, collaborative way to do this work that's ever existed. Okay, System Initiative is out there. It is GA and it's the future. Go to systeminit.com. Get started in three clicks. They do have a free tier that means free. No credit card required that you can play with. Again, systeminit.com. That's S-Y-S-T-E-M-I-N-I-T.com.
How big is the team that manages Hackaderm? It's a rotating cast of folks. We've got some core folks, probably, I would say probably like four or five core folks who are around and then, you know, volunteers who kind of come and go.
So and what's nice about the infrastructure that we've set up so far is that it's I mean, it's relatively stable, right? Like probably like the main stressors that would cause it to become metastable would be maybe like another Twitter exodus, right? Like another big surge of growth. But even with that, we've kind of planned for the future. We have got spare capacity.
and kind of like the key scale points right now. So yeah, largely the software just kind of runs, right? And like the big activities end up being stuff like major upgrades, right? So version bumps from Mastodon Upstream. So we just did 4.3.0 last week, which is probably our biggest upgrade that we've done in a while.
And then we did 4.3.1, which was basically a non-event because it was super simple. But generally, the hard upgrades are upgrades that involve database migrations. Those are always the hard ones. Exactly. It's just because you're touching the real data. And there's always a little bit of fear there, but we've got some good mitigations in case. Is it still Rake?
Yeah. So it's just core Rails stuff, right? Rake db:migrate. OK, cool. And there you go. Has the Redis licensing affected the way that Hackaderm runs at all? So it hasn't. But we have had conversations about moving to something that is truly open source. And kind of in the same vein, Terraform going to the BSL license. We've talked about OpenTofu. Actually, at this point, we've been-- I forget which version of Terraform it is.
But we pinned the very last version of Terraform that was under the open source license. And then again, at some point, another mini project could be, let's go OpenTofu.
You have multiple clouds, which Terraform is just great. It's just like, hey, we'll make this look like one thing, basically. But then you only have a basic, a few servers in Hetzner. And how are you actually deploying that? Is it like Terraform, create my server, manage my server, and then like Ansible provision the application? Is it like a Kubernetes cluster on those servers? What are you doing? So yeah, so less glorious than that. So we actually, again, another key infrastructure decision that we made pretty early on was no Kubernetes.
Because we wanted to avoid really the complexity that that introduces. And again, don't get me wrong, I use Kubernetes all day at day job. But for this, we made a key decision where like, hey, it adds layers of indirection, especially when we think about the network and
and the abstractions that you have to deal with through Kube API. And we said, hey, we're going to treat this just like a simple application, right? So, and right now I will be super honest, deployments are pretty manual, right? So it's SSH and go do the thing, right? You know, we've done some light Ansible work around provisioning, you know, kind of the post-provisioning after a host boots. But yeah, for the most part, it's actually manual, right? So with like the core infra around the hosts themselves being terraformed.
Yeah. You're like, I get an IP address and I'm going to go do the thing as an admin that I know how to do. I think a lot of people over, overemphasize that everything needs to be automated. I'm like, no, I was like, I've run a lot of infrastructure in a lot of places and having manual steps is actually really important for some things. Yeah, totally. And there, there are certain runbooks for activities that happen a lot. Right. So again, Mastodon upgrade, there's a standard script that we run every single time we do a Mastodon upgrade. It's not that it's unattended, it's just very automated, and a human watches it, and when it's done, it's done. And yeah, without a database migration, maybe an upgrade takes 30 minutes.
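A minimal sketch of that "SSH in and run the runbook, with a human watching" style of deploy; the hostnames and script path are invented, and this is not Hackaderm's actual script.

```python
import subprocess

HOSTS = ["web1.example.internal", "sidekiq1.example.internal"]  # hypothetical hosts
UPGRADE_SCRIPT = "/opt/mastodon/bin/upgrade.sh"                 # hypothetical runbook script

def run_upgrade(host):
    """Run the standard upgrade script on one host and stream its output,
    so whoever kicked it off can watch it finish."""
    print(f"--- upgrading {host} ---")
    subprocess.run(["ssh", host, UPGRADE_SCRIPT], check=True)

if __name__ == "__main__":
    for host in HOSTS:
        run_upgrade(host)  # sequential on purpose: watch each host before moving on
```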
Even when we did the 4.3.0 upgrade, I think that we did that over the course of maybe two hours. And again, largely uneventful, pretty straightforward. It just involved a database migration that took a little bit longer. Do you guys have a team of people that get paged for Mastodon? We actually don't. So the way that we've got it set up right now is we've got an uptime robot that will watch, kind of do our synthetic checks to our public facing properties. That'll alert into Discord.
We've got folks kind of around the world and we've got an infrastructure channel. If somebody sees something, they will go do something, right?
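The synthetic check plus Discord alert they describe is simple enough to sketch; the URLs below are placeholders, not Hackaderm's real health endpoint or webhook, and a hosted service like Uptime Robot does roughly this for them.

```python
import requests

CHECK_URL = "https://example-instance.social/health"      # placeholder health endpoint
WEBHOOK = "https://discord.com/api/webhooks/123/abc"      # placeholder Discord webhook URL

def synthetic_check():
    try:
        ok = requests.get(CHECK_URL, timeout=10).status_code == 200
    except requests.RequestException:
        ok = False
    if not ok:
        # Discord webhooks accept a JSON body with a "content" field.
        requests.post(WEBHOOK, json={"content": f"{CHECK_URL} is down"}, timeout=10)

if __name__ == "__main__":
    synthetic_check()
```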
So nothing as fancy as actually getting paged. I'm on the infrastructure team for Southern California Linux Expo. And we have a Slack channel that has a Datadog alert that's like, hey, guess what? The site's out. Someone's like, oh, I'm on. I'll check it. Yeah, absolutely. Yeah. And then to kind of continue the observability stack, again, another mini project coming up here in a second. So we've got Grafana. There's a public site to Grafana. So if you actually want to look at our public dashboards, you just go to grafana.hackaderm.io.
You can see the public dashboard that kind of has like the key vitals, right? Not quite at the level of like SLO, more like an SLI where we're looking at like very specific health attributes of the different services that we provide. That requires a login.
Oh, I might have to send you the public dashboard specifically. I was going to put it in the notes. I'll send it to you afterward. But yeah, you can look at the key metrics. And again, like we talked about Sidekiq earlier, that's probably one of the main things that we watch is how hot are the Sidekiq queues? Are they getting super deep? Are they getting backed up for some reason? What if it gets stuck and you have to kick it?
You gotta do something, right? So Grafana kind of has that main visualization layer backed by Prometheus for metrics. We're manually registering servers there for scrapes and things like that. Not too long ago, we set up Loki and Promtail to scrape logs and ship those centrally, so they're available in Grafana as well.
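As an illustration of the queue-depth metric they watch, a tiny exporter like the one below could publish Sidekiq backlog to Prometheus. It assumes Sidekiq's standard Redis layout (a list named queue:<name> per queue); the queue names, port, and Redis endpoint are guesses, not Hackaderm's actual setup.

```python
import time

import redis
from prometheus_client import Gauge, start_http_server

QUEUES = ["default", "push", "pull", "ingress", "mailers"]   # typical Mastodon queue names
depth = Gauge("sidekiq_queue_depth", "Jobs waiting in a Sidekiq queue", ["queue"])

r = redis.Redis(host="localhost", port=6379)                 # placeholder Redis endpoint

if __name__ == "__main__":
    start_http_server(9394)                                  # arbitrary port for Prometheus to scrape
    while True:
        for q in QUEUES:
            depth.labels(queue=q).set(r.llen(f"queue:{q}"))  # backlog = length of the Redis list
        time.sleep(15)
```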
What else from an observability standpoint? Oh, the mini project that's coming up is, in Mastodon 4.3.0, there's finally support for OpenTelemetry. So the next little mini project is to get an OTel Collector spun up and start presenting that, which is actually going to be pretty important because as of 4.3.0, they also deprecated the StatsD integration.
I guess, decommissioned it. We just had Austin, Austin Parker on the show, who's on the OTel board, which was a really fun conversation. So yeah. Awesome. Super cool. Yeah. So that's, that's going to be a fun little mini project that might be something I'd like to try to do this weekend or something. So yeah, totally.
You said you've been going through these waves of like people leaving everything and like you went from that 500 to over 30,000. How many active users are on it today? So I think the last time I checked monthly active users was somewhere around 11 to 12,000. So, you know, that's still a lot of folks. So I think total accounts was something like 55,000. That's on the public dashboard as well. I think monthly active was like 11 or so. Cool. Yep.
It's been running okay. You've scaled up the things. The Redis queues and the PgBouncers, are there other areas that you think, this is the next thing that's going to fall over? If you get to 100,000 users or 30,000 active monthly, what's the next thing that's going to break? Yeah, totally. I think insofar as just scaling, it's probably Sidekiq, where there's probably good opportunity for us to do...
a bit better about maybe like reactive auto-scaling, even, with Sidekiq, where it's like, hey, we have a predefined threshold where we go spin up another machine that goes and runs X number of Sidekiq queues, right, or replicas, to go handle load whenever we're a certain percentage behind or a certain percentage deep in the queues. I mean, that generally tends to be the thing that gets stressed the most. That's where we see things get hot. The web side, you know, I mentioned earlier, there's a project to go, you know, put some redundancy there. That's more from like a redundancy and a failure mitigation perspective and less from
like a scale or a stress point. It actually tends to do pretty okay. And then, I mean, I'm always afraid of the database, right? Just because the data, the data is the data, right? So I think there's some good opportunities there. Again, less from like a scaling perspective because Postgres does like a freaking awesome job of just like
running so well. So it'd be more, again, from like a read replica perspective, maybe offload a little bit from the primary, but more from like a, hey, if the big database goes down, do we have a place to flip quickly without having to restore from backup? That sort of thing. Yeah. How many backups do you have currently? So right now we do a weekly full backup. I'll have to go double-check how long we're keeping those, but I think it's three weeks at a time, 21 days. And then
And then we do a daily incremental diff. So we're actually in a pretty good position from a backup perspective. That is actually really good, especially for this, how small your team is and the money you're spending per month. Exactly. Well, and like for us too, it's like we can have that small team and, you know, have some solace in the fact that like, hey, if something really, really bad happened, like let's say all the disks for whatever failed on like that primary.
we can get back and, you know, have a reasonable, you know, kind of like RPO, RTO, right? I think it's also because you have like a very talented team. I don't know if it could be run the same with the amount of people you have if you guys didn't have so much experience. Absolutely. Like, yeah, don't get me wrong. Like the folks that we had, you know, building the initial team
architecture to get us here that actually executed on it. The folks that have been helping us over the years, even the folks who have just showed up for like a month or two, done a little bit and then gone and done the next thing. Like everybody's been amazing, right? And we absolutely wouldn't be here. We wouldn't be able to run this without those folks. Well, you're one of the talented people too.
I try to help, you know. So, but yeah, but, you know, just think about other scale points again, like I just always come back to Sidekiq, right? Because that is, again, as the Fediverse graph grows, that's the thing that gets really, really stressed. Have Sidekiq and the database been the most painful parts of your infrastructure? Yes.
Like, yes, period. Yeah. I mean, like Redis is pretty commonly known and just like kind of works. And then Puma is like, yeah, no, Puma is fine. It's there. Yeah, exactly. Yeah. Like generally, like when I think back to like, you know, incidents that we've had or challenges that we've had or like the scary moments that we've had, it's generally been, yeah, Sidekiq queues are way backed up. What's up?
And then really figuring out the tuning. Hazel Weakly did a lot of really great work around tuning of those queues, parallelism, number of queues, where they're actually deployed, that sort of stuff. And then the database. The Postgres. Yeah, for Postgres. And then it was essentially just, let's give it a lot of power, a lot of CPU, a lot of memory, fast disks, and let's over-provision everything.
So that we know it's ready for it. Which is interesting too, because like that, the vertical scaling side of it's when you're using a metal server is just so much more cost efficient than like, oh, let me scale up this EC2 instance, which is like my cost is going to go up four times. But that's hard with relational databases because they take so much like time.
They'll take what you give them. You know what I mean? Yeah, exactly. So it's like everybody's like, we can just do everything on Postgres, which Postgres is a beast. But I think we're also going to learn why we have non-relational databases or NoSQL databases, because everyone's like, just put everything in it. And I'm like, I think you forgot the scaling problems that people have on a relational database. But Postgres is like the GOAT.
And everything runs on it. Yeah, exactly. Yeah. And like the other thing that we have to keep in mind is like, you know, because they're metal servers, if we want another one, right. If we say we're going to go add another server, that is a process that takes a week.
That's actually fast, though. Yeah, don't get me wrong. That's fast from like, like Hetzner has been very responsive, very reactive from that perspective. But it's not, hey, I go to EC2 and do a RunInstances and suddenly I've got another machine in a couple of minutes, right? Yeah, exactly. So we've got to have, you know, some pre-planning and foresight there. And again, it was just like, hey, let's get a big box. Yeah.
You mentioned Hazel and we're going to have her on the show in a couple of weeks, but she and I were talking about just like the progression of that base infrastructure. And I thought one of the most fascinating things was when it was running off of Alice,
one of the things was that Chris kept getting calls from her ISP. Like, you're doing a lot of traffic, right? You need to get a business plan, right? And so like that was part of the motivator of like, we actually, we can't run this big social network from someone's house. Yeah, exactly. No, that was a big part of it. Transferring it to like a foundation, again, as a part of just getting
out of Nova's basement was a big part of it as well. And then also kind of starting to even think about things like data residency, right? And just thinking about the different legal jurisdictions that we operate in where we have folks who have data. And it was like, okay, hey, we're going to move to the EU very specifically as a very specific choice.
And Germany is a very specific choice, right? Because of the protections offered there and the strength of those laws. I remember she posted a picture of Alice like in her basement and it was so cool. It's like you almost forget how much servers do for social media because it's
just everything lives in your phone in the cloud and to see a physical server and everybody like being connected on it was so cool. And the fact that that one physical server from someone's house could handle 500 ish people on like, here's my server. Like this is, this is fine. Like before the ISPs come knocking at the door before you need anything like every,
like scaling out, and you don't need PgBouncer and all that stuff. You're like, I have enough disk capacity, with some NFS maybe, to do this. It's not, maybe not amazing, but it works. And people underestimate, like, I want to go start a blog, I want to go do something, like just go run it from your website or from your house. Like you, you have a public IP address. Like you can like DNS that and like, you can even rotate it, all that stuff. And you can just do it from your house and start without needing to like go pay for all this other stuff.
Yeah, absolutely. And one thing the Mastodon project has done really well, I know I said we don't use Kubernetes or containerized Mastodon. They've done an excellent job of building this little micro ecosystem where, yeah, if you run Kubernetes, you can very easily just say, I want a Mastodon and you go apply the deployment and you've got one, right? Which is super cool from an approachability, low barrier to entry perspective.
As long as you know Kubernetes. As long as you know Kubernetes. The bar is actually kind of high. But also the same with Rails, right? Like if you know how to run a Rails app, then you're like, oh, this is all familiar. Like that was the whole point of Rails is just to make it familiar to everyone. If you go from one Rails app to the other, like guess what? This is going to scale the same way. And it's going to be very familiar to debug it and to see what's going on.
Okay, but why do they use end so many times? Why do they use what so many times? For Ruby. When they do like end, end, end, end, end instead of the brackets. Like it drives me nuts.
Like, I just want to know like who did this and why. Where are the curly braces? Yes. Give me curly braces. Cause it just, you like it, you close the braces, not end. It's like you have to read it while counting ends. Like I've spent so many hours with a missing end. Like I have flashbacks. Yes. Yeah. I am, I am not a Ruby person, but I've learned enough to, to, to be dangerous and to at least keep Mastodon running. So.
I mean, I'm just, I would rather tune Sidekiq than try to like dive into the JVM again. Right? Like the two big apps I've done in my life are JVMs or Ruby. And I'm like, nah, like give me, give me Ruby. I'll take the brackets and the structure. Cause I hate unstructured. You like sentence-long functions. It's cool. You want a factory. But you know what, it's good, but we know like specifically what we're asking it to do. Okay. So what were some of the laws that made you pick the EU? Like,
what in Germany specifically, like, what were some of the things that motivated you? Yeah, absolutely. So GDPR obviously is probably like the big prevailing one, right? And again, like
The type of data we have is really interesting, right? Because in a lot of ways, I'm not going to say it's PII in the sense that it's like, you know, personally identifiable information. Sometimes it is. But it could be, right? Like people could put interesting PII out there. But it's information that is very personal to the people who are posting it, right?
Again, the intent was for us to be in a place where we could feel safe that maybe the prevailing government or whatever wouldn't show up and be like, "Yo, Hackaderm, you're hosting this type of content. We're going to shut you down." Mastodon instances have gone down for that. That's not even a theoretical, "Oh no, this maybe is happening."
I was fascinated talking to Hazel about it where it's like the, one of the reasons that it's under Nivenly as a foundation is to protect the people that are doing the work, right? Cause there's like, Oh, you have, you're the web admin on this thing. You're coming to court. We're coming to you. Right. Yeah, exactly. Especially in like a world where it's like, you know, DMCA takedown notices, you know, like think of just all the different legal ways that people can go after. Can you describe why that's specifically an issue for something like Mastodon or
where it's like, if I run a website, that's not a problem. Sure. Like DMCA. Yeah. No, I mean like, like just like the content, how, why is other people posting content on other servers a problem for you? Sure. Yeah. So part of it is, you know, people can post whatever, right? So, and again, depending on the location of the server, depending on the authorities, the prevailing government though, and the type of content that's being posted, you know, we could be
put in the crosshairs where the government will be like, hey, you're hosting this content that in our jurisdiction is illegal or offensive or whatever. You need to take care of it. Or we may block your site or, hey, we may go after your company or the individual, the officers of your company, because again, you're potentially doing something illegal in our jurisdiction.
So finding kind of that sweet spot, and again, like Germany, when we did the move, was super important for us. Seems like you've put a lot of care and thought into it, which is cool. Yeah, absolutely. I mean, like the intent was, you know, it started in Chris's basement. The intent is for this to be a thing that's around for a really long time. Right?
Right. And again, like that's where the technical architecture, the legal architecture, even though we're not lawyers, you know, we spent that energy putting all that thought in there because we wanted to be around, you know, in five, 10, however many years.
Okay, friends, I'm here with a new friend of ours over at Timescale, Avthar Suwathan. So, Avthar, help me understand what exactly is Timescale? So, Timescale is a Postgres company. We build tools in the cloud and in the open source ecosystem that allow developers to do more with Postgres. So, using it for things like time series analytics and more recently, AI applications like RAG and search and agents. Okay, if our listeners were trying to get started with...
Postgres, timescale, AI application development. What would you tell them? What's a good roadmap? - If you're a developer out there, you're either getting tasked with building an AI application or you're interested in just seeing all the innovation going on in the space and wanna get involved yourself. And the good news is that any developer today can become an AI engineer
using tools that they already know and love. And so the work that we've been doing at Timescale with the PGAI project is allowing developers to build AI applications with the tools and with the database that they already know, and that being Postgres. What this means is that you can actually level up your career, you can build new interesting projects, you can
add more skills without learning a whole new set of technologies. And the best part is it's all open source, both PGAI and PG Vector Scale are open source. You can go and spin it up on your local machine via Docker, follow one of the tutorials on the Timescale blog, build these cutting edge applications like RAG and search without having to learn 10 different new technologies and just using Postgres in the SQL query language that you will probably already know and are familiar with.
So, yeah, that's it. Get started today. It's the PGAI project, and just go to any of the Timescale GitHub repos, either the pgai one or the pgvectorscale one, and follow one of the tutorials to get started with becoming an AI engineer just using Postgres. OK, just use Postgres and just use Postgres to get started with AI development, build RAG,
search, AI agents, and it's all open source. Go to timescale.com/ai. Play with pgai, play with pgvectorscale, all locally on your desktop. It's open source. Once again, timescale.com/ai. How do you go about, uh,
Because like Mastodon also caches from other servers, like I follow someone from another server, and I'm just going to get random content from that other server, even if I'm not following the person. And again, it's not even just your users. Like your users, you could kick out, right? Like someone who keeps posting something that's illegal, you're like, actually, you can't do that here, go somewhere else. And you can just ban them, basically.
But other people posting things on other servers, you can't ban the person over there. Is the only recourse to like block all access to that other server? Yeah. So that's actually, so I'll, I'll say it two different ways. So it's something that Mastodon is doing well,
but can continue to do better, right? Because they've actually started to build some good moderation tools where, like you said, locally you can absolutely say like, hey, this member, they're banned because they posted whatever they've offended the server rules or whatever. But you can actually also do that for people who are not on your server, right? So if somebody from Mastodon.social is spamming, let's say, right, which actually happens very, very often, you can individually block that account.
right, even though they're on a remote server. If you find that the server as a whole, let's say like there's an unmoderated server or a server with questionable content or positions that don't agree with your kind of code of conduct, you can actually de-federate from that server entirely.
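A toy illustration (nothing like Mastodon's real implementation) of the two levers just described, blocking one remote account versus defederating a whole server; the handles and domains are made up.

```python
BLOCKED_ACCOUNTS = {"spammer@mastodon.social"}    # individual remote accounts that are blocked
DEFEDERATED_DOMAINS = {"unmoderated.example"}     # servers that are defederated entirely

def accept_activity(author_handle):
    """Return True if an inbound federated activity should be processed."""
    domain = author_handle.split("@", 1)[1]
    if domain in DEFEDERATED_DOMAINS:
        return False   # nothing from that server gets relayed at all
    if author_handle in BLOCKED_ACCOUNTS:
        return False   # this one account is blocked; their server is otherwise fine
    return True

print(accept_activity("friend@fosstodon.org"))        # True
print(accept_activity("spammer@mastodon.social"))     # False
print(accept_activity("anyone@unmoderated.example"))  # False
```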
Right, so that means any activity that's happening on that server is no longer relayed through yours. And honestly, as much as we're talking about the infra here, to me the real magic of Hackaderm is the moderation team. I mean, every social network is that way; the moderation is the thing that's important. Exactly. Ultimately, the network and the experience for the people on your network is only as good as your moderation team.
That's something where our folks, again, spent a lot of time and thought and energy just thinking about, okay, what does it mean to moderate? What does the moderation experience look like? What should that look like for folks who are on Hackaderm, or even off Hackaderm, right? Because there's always the potential that somebody joins Hackaderm and starts spamming from us, right? How are we on top of that? How are we treating people with respect, but at the same time making sure that we've got good boundaries, so that we're keeping a healthy community for our folks? Is there, I mean, because you mentioned that
relay blocking thing, that immediately to me sounds like what I would do with a Pi-hole, right? Where I'm like, I've got this big hosts file where I'm just like, no, you just don't reach any of that. Is that like a subscription? Like, are there known Mastodon servers where you subscribe to a blackhole list of, we're not allowing that here? Yeah, totally. And I think this is a big evolution point for Mastodon, and potentially the Fediverse as a whole, actually figuring that out.
So there are a number of services or organizations that do provide things like that, where you can say, okay, I want to subscribe to a defederation list. And again, this is actually a really big active debate that's going on in the community right now. Should they be allow lists? Should they be deny lists that you're subscribing to? How do the subscriptions work, both on a technical and a social level? So yes, you can.
Right. And again, depending on how your moderators like to run your servers, some people do, some people don't. And right now, it's not something that's built into the core; subscribing to defederation lists isn't something that's built into Mastodon core right now.
Not at the ActivityPub level? Yeah, correct. Yeah. So, like, Mastodon here is the upstream Mastodon project. Yeah. Because I know other Mastodon servers that don't have those features and don't even do relays, right? Like it's a super lightweight, I-can-run-this-for-myself thing; I don't do a lot of the things that are really heavy about a Mastodon server. I don't do it. It's fine. And you don't have to do those things. But the actual subscription piece, the let's-make-sure-this-works-across-services part, is mainly just for the upstream Mastodon server.
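Since list subscription isn't built in, the usual pattern today is a small admin-run script: fetch a deny list published by a curator you trust and apply each domain as a block. A minimal sketch under the same assumptions as the earlier example, with a hypothetical LIST_URL; real scripts would also handle pagination and removals.

```python
import csv
import io
import requests

INSTANCE = "https://mastodon.example"            # placeholder instance
TOKEN = "REPLACE_WITH_ADMIN_TOKEN"               # needs admin domain-block scopes
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
LIST_URL = "https://lists.example/denylist.csv"  # hypothetical published deny list

def sync_deny_list() -> None:
    """Ensure every domain on a curated deny list is blocked locally."""
    resp = requests.get(LIST_URL, timeout=30)
    resp.raise_for_status()
    wanted = {row[0].strip().lower()
              for row in csv.reader(io.StringIO(resp.text)) if row}

    # Domains already blocked (first page only; real code follows Link headers).
    current = requests.get(f"{INSTANCE}/api/v1/admin/domain_blocks",
                           headers=HEADERS, params={"limit": 200})
    current.raise_for_status()
    already = {b["domain"].lower() for b in current.json()}

    for domain in sorted(wanted - already):
        r = requests.post(f"{INSTANCE}/api/v1/admin/domain_blocks",
                          headers=HEADERS,
                          json={"domain": domain, "severity": "suspend"})
        r.raise_for_status()
        print(f"blocked {domain}")

if __name__ == "__main__":
    sync_deny_list()
```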
I don't know if it's, I guess it's official. It's the one at mastodon.social. Yeah. How does this relate to Threads, right? Like Threads is peering with ActivityPub, Tumblr is starting to do that. Are you seeing any uptick in traffic, or
new places you have to scale or block or moderate? Because Facebook has a lot more infrastructure than you do, and they can just throw more money and more servers at the problem and say, well, you know what? Sidekiq, you just get a whole data center. I don't care. It doesn't matter to us. Is that causing extra stress on your instances?
Yeah. So far, no, not right now. That being said, particularly with Threads, and again, I'm actually not super familiar with their internal infrastructure, how they're set up, but they're starting to engage in limited federation. They're very piecemeal about it, like, we want this feature from this thing. And to make you feel warm and fuzzy that they have ActivityPub. Exactly. Exactly. Yeah. And so super early on, when Threads first announced, hey, we're going to federate into the Fediverse, you know, again,
Big conversation, big controversy about, should servers even allow that, right? Is it okay for Meta to start to push things into the Fediverse? So big discussion there. From a technical perspective, yeah, just like you were saying, Justin, the concern is, okay, well, what if they turn it on for literally everybody? You would just blow it out of the water. And it just destroys, yeah, exactly, the Fediverse. And some people are thinking, hey, that's the plan, right? That's how they're going to destroy the Fediverse and take it over. But to this point so far, again, because I think
they've taken kind of this limited, measured approach to federation, we haven't seen anything, from a Sidekiq perspective or just a general scale perspective, that's broken anything at this point. Now, is it going to change in the future? I don't know, but so far so good. I heard a recent podcast with whoever's running Threads over there, and they were asking, are you ever going to turn on, by default, Fediverse interaction? He's like, absolutely not. We're never enabling that by default. And immediately you're like, okay, cool. We're never going to get more than, we're never gonna get a double digit percentage of Threads users to go find the obscure setting and say, yeah, federate all my stuff across ActivityPub all the time. It's just not going to happen, because it's not going to be there by default and they have no incentive to do that,
because they're the ones selling the ads. I don't think most people even know that it exists. Right. And that's the point, right? It's a feature to get the nerds on board. Exactly. Yep. Yeah. And like you mentioned ads, that was a big concern again with the federation model: okay, well, is Meta going to start trying to push ads through federation, and then I'm on Hackaderm and I see an ad for something? So again, to this point, we haven't seen that. Obviously, as the situation evolves, we might change our stance. I never really thought about how that might work, because by default, it's only users that are doing the inbox/outbox sort of handshake between instances.
But is there other metadata that gets sent that they could use to try to push content to you? Yeah, so it depends on the scope that you're looking at. Normally, when you log into the Mastodon UI, you're seeing your kind of feed, right? It's like, okay, who are the people and hashtags that I'm following or watching. But you can actually change scopes to look at what is happening in the Fediverse, right, for all the things that your server has relayed. So in that case, they could push a post. They could have a user, a quote-unquote user, that could push a post that's an ad, and it happens to come from threads.net, and it shows up in kind of that global Fediverse view. So that could be one way. You just have to trust them at that point, too, because they could say, this post has a million likes, and you have no way to verify or do anything around, like, oh yeah, obviously that's the most popular thing in the world right now, we should show it to more people. Yeah. And with the way that Mastodon works right now, there are no algorithms
in the sense that things aren't weighted. There are things like trending hashtags that might show, but in terms of your feed and your posts, it's linear, it's first in, first out. But yeah, if they say it has a billion likes, that's what the ActivityPub message said, so it must be true. And I guess they could also just send you a bunch of posts that have a hashtag to get that into trending, right? Like, oh, even if you're not showing this to people, here's a bunch of stuff, and you have no way to verify it. You don't have any way, because it's their infrastructure sending stuff to you, and you're just like, all right. Yeah, that's the interesting thing about the federation model, or really federation in general, right? In the end, it becomes a web of trust, right? And it's like,
do you trust these other servers to send you things that you want your people to see? It's a web of trust, and it's affecting your site, which is the weird thing, right? Because, like, my blog, my web ring, I get back all of it. With my web ring, I trusted those people because I was reading their content for a while, but anything they posted didn't actually affect me.
Right, it didn't show up on my site. And I haven't had comments on my websites for a long, long time, because I don't want your stuff showing up here. If I want to hear your opinion, I'm going to go out to social media and find the opinion out there, but I don't want it muddying the water of, someone just read this. Even YouTube comments, right? I'm glad they hide most of them by default now, because most of them are like, this is trash, or whatever. It's going to taint someone else's view of, did I just read something good? Did I watch something I liked? It doesn't matter, because someone else that was popular said this sucks. And, okay, well, maybe nine out of ten comments say this sucks, so I'm the wrong one. It doesn't actually matter. So I turn that stuff off on my sites, but it's fascinating that you now have to trust some other website to send you something that you can't verify, and it might affect how people see stuff that is not controlled by you, but is somehow aggregated and shown. Yeah, absolutely. And that goes back to moderation, having good moderation practices, consistent moderation practices. Moderation is key in being able to block people.
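For context on the inbox/outbox handshake mentioned above, a federated post arrives as a JSON document that the receiving server largely has to take at face value. This is a rough, simplified illustration of the shape, with made-up domains; real Mastodon deliveries are HTTP-signed POSTs to an inbox URL and carry more fields. The signature proves which server sent it, not that anything inside it, like a like count, is true.

```python
# A simplified ActivityPub "Create" delivery, shown as a Python dict.
# Domains and IDs are invented for illustration.
create_activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://remote.example/activities/123",
    "type": "Create",
    "actor": "https://remote.example/users/someone",
    "to": ["https://www.w3.org/ns/activitystreams#Public"],
    "object": {
        "id": "https://remote.example/notes/456",
        "type": "Note",
        "attributedTo": "https://remote.example/users/someone",
        "content": "<p>Whatever the origin server says this post is.</p>",
        # The receiver can verify the HTTP signature, but not claims like this:
        "likes": {"type": "Collection", "totalItems": 1_000_000},
    },
}
```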
Yeah, absolutely. And content absolutely does affect reputation. If we take too long to take action on something, or potentially even if we take what's perceived as the wrong action,
Right. People might say, oh, well, Hackaderm moderation, y'all aren't doing it right. And I do find it interesting that the Mastodon model of moderation is the server, basically. There are individual block lists, but eventually you have to land on a community server that is well run, where you trust the moderators are doing things in your best interests. And then your only recourse, if that isn't the case, is to find another community that you align with a little better.
Mastodon doesn't have anything like Bluesky has, where block lists are individually maintained. Like someone says, hey, these are harassers, these are people that I don't like; if I have a list of five people I don't like, you can subscribe to my block list and you also won't see those five people. But Mastodon doesn't have something like that at the individual level, right? So at the individual level, I don't think so. I'd have to go double check.
I know there are scripts that will let people import them and do individual API requests to do blocks like that. Twitter had a similar thing. You could subscribe to a block bot or whatever, and it would automatically, because it had access to your account, go in and say, okay, yeah, I'm going to block all the things that this other person also blocked. Yeah. At this point, I know at the server level there's not the notion of a block list where you can just say, hey, I want to go use mastodon.social's block list or whatever. That doesn't exist. Again, there are multiple proposals for how to do that that are kind of flying around right now. And then, yeah, at an individual level, I don't think that exists either. Again, I'd have to double check. Yeah.
But to your point, yeah, if you find yourself in a community that's not aligning with what you want, you can use the migration feature to move. But obviously that's a heavy toll, right? Migration isn't bring-your-content, right? It's bring your metadata, bring your identity and your followers. And it's not even bring your identity, it's redirect your identity. It's like, oh, hey, redirect over here. Right. Right. I equate it to kind of like, if I move my house,
my new house is somewhere else, and I can tell the post office to forward my mail for 30 days. It's like, hey, keep sending me this mail if it comes here, it's actually for me, I don't want to miss something, please just do that for me. Do you see that happening a lot? Do you see people going between servers frequently? You know, it depends. I mean, that's again the fun part about the Fediverse: the graph is always changing, servers are coming and going. So I would say in general, yes, we do see that happening a lot.
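Under the hood, the "redirect your identity" step is roughly an ActivityPub Move activity: the old account announces its new address, and followers' servers re-point their follows, provided the new account lists the old one as an alias. A simplified sketch with made-up domains; in practice Mastodon drives this from the account settings, not hand-written JSON.

```python
# Roughly what an account migration announcement looks like on the wire.
# actor and object are the old account; target is the new home.
move_activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://old-home.example/users/someone#moves/1",
    "type": "Move",
    "actor": "https://old-home.example/users/someone",
    "object": "https://old-home.example/users/someone",
    "target": "https://new-home.example/users/someone",
}

# For the move to be honored, the new account must point back at the old one,
# typically via an alsoKnownAs alias on its actor document.
new_actor_alias = {
    "id": "https://new-home.example/users/someone",
    "alsoKnownAs": ["https://old-home.example/users/someone"],
}
```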
And that was one of my fears, that these servers get shut down because they're too expensive, for legal reasons, people aren't interested in running them anymore, whatever the reason. And that's like my post office shutting down after I told them to forward my mail, right? Now that post office doesn't exist anymore, nobody has that rule to forward your mail. Sorry, you're just a new person. Yeah, exactly. It's just like, oh, hey, X site is offline. That domain name doesn't respond anymore, right? So there's nothing there.
Right. Immediately I thought you meant, like, X, the site X.com. Oh yes. Oh, sorry. Is X down again? I was like, oh yeah. Yes. Yeah. I meant that as a generic domain. Yeah.
If X is down, you get more users on your side. Yes, I keep forgetting that that's the domain now. I still call it Twitter. I do too. I call it Twitter when I want to talk about features that I liked. That's awesome. X is like the derogatory term. It's scary to lose all your content on Mastodon.
There is an export feature that can give you your content, right? It gives you a zip, similar to, I mean, because people forget that X has that and Facebook has that. And this came out from, when, what was it called? Google Plus. No, what was it? Yeah, Google Plus, the Wave, what was that called?
I forget. Yeah, it was the social network. There was this huge legal battle where they had to allow you to export your content, and all these big social networks had this. It was called Takeout, that's what it's called, Google Takeout. They were the driver of this because they're like, we want people to move to Google Plus, and so we're going to make this legal battle to make Facebook do it, we're going to make Twitter do it, we're going to make
Instagram do it. And so you can go to all those sites, and there's a legal reason, at least in the United States, I don't know about the rest of the world, but you can go download your content, and they give you a zip. And like the Twitter export, right? If you export your data, you get a little mini website and it shows you your stuff, all of the things. And the idea was that you could then go somewhere else and import that
into another social network. It's like, hey, here's all my stuff, please import this for me like it was yours. And I know there were some projects that started up around that, that would be like, oh, we're going to take your Twitter and put it in Tumblr, and we're going to do those sorts of things. But it never really caught on at a large scale. And then Google got bored and they turned off Google Plus, and all the stuff that normally happens. I might go export my Twitter feed. I deleted all of my Twitter and I kept an export, so I have an export. Can you mass delete your Twitter without deleting your account? Not anymore, because of the API limits. You used to be able to; I did it before the API limits came out. You can pay for a service to mass delete your X posts, because they're like, we will pay to get more API quota for you. There's also some free scripts you can run that'll be like, hey, delete my last 2,000, and then the next week, or the next day, you have to run it again. Yeah. Now that
I've moved so much over to Bluesky, I think I might just mass delete my Twitter, because of the shady things that that man is doing. Did you see the thing today
about how he says he talks to Putin, like, all the time? Like, just the amount of data that he has on all of us. And yeah, I deleted 16 years of content. I was on that site, I really enjoyed it, and I was just like, nope, not doing it. I have to get out, and I just don't want to contribute to the data and to the things that he's doing that are wrong. Especially after just seeing, like,
Yeah.
it started to outpace Twitter, I was like, there's just no, it's icky. I don't want it anymore. Done. Let's go. Yeah. One thing that brings up, going back to the largest cost for Hackaderm, which is storage, right, and it's media storage, right. So one thing we'll have to figure out, probably sometime soon, is what's the cost of infinite, unbounded storage, right? If we just keep everything forever,
we wouldn't be able to run Hackaderm, right? So at some point, you know, what does archiving look like? And then what's the user experience impact? It's like, okay, if I wanted to go export 16 years of Hackaderm history, do I lose the first 14 years of images and things that I posted? That's true. Yeah, that would make me sad. Yeah, exactly. So, but then at the same time, it's like, we can't keep that stuff forever because then, you know, we'll be paying $10,000 a year on S3 storage or whatever, you know? Yeah.
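To put rough numbers on the unbounded-storage worry: if media grows by a fixed amount each month and nothing ever ages out, the monthly bill grows linearly and the cumulative spend grows roughly quadratically. A back-of-envelope sketch with made-up figures; the growth rate and per-GB prices are assumptions, not Hackaderm's actual numbers.

```python
# Back-of-envelope: keep everything hot vs. tier media to cold storage.
# All numbers are illustrative assumptions, not real Hackaderm figures.
GROWTH_GB_PER_MONTH = 500      # new media added each month
HOT_PRICE_PER_GB = 0.02        # $/GB-month, object-storage-ish
COLD_PRICE_PER_GB = 0.004      # $/GB-month, archival-tier-ish
MONTHS = 60                    # five years
COLD_AFTER = 6                 # months before media moves to the cold tier

keep_everything_hot = 0.0
tiered = 0.0
for month in range(1, MONTHS + 1):
    total_gb = GROWTH_GB_PER_MONTH * month
    keep_everything_hot += total_gb * HOT_PRICE_PER_GB

    hot_gb = GROWTH_GB_PER_MONTH * min(month, COLD_AFTER)
    cold_gb = total_gb - hot_gb
    tiered += hot_gb * HOT_PRICE_PER_GB + cold_gb * COLD_PRICE_PER_GB

print(f"Five-year spend, everything hot: ${keep_everything_hot:,.0f}")
print(f"Five-year spend, tiered:         ${tiered:,.0f}")
```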
Yeah, that's fascinating. Because you want it to live for a long, long time, and that storage isn't going to get less. It's not going to... Yeah, it's like, what's that balance, right? I mean, there's definitely a balance of how much you keep in a CDN, but there's still a balance across hot, warm, cold sorts of storage tiers. Are there any other long-term risks, do you think? Oh.
Other long-term risks? I mean, from a customer experience perspective, or a consumer experience perspective, I think that's probably the main one, right? The cost of storage just keeps going up. Other big stuff could be political jurisdiction, legal jurisdiction type stuff changing. That's always a thing. To me, those would be the big ones, right?
But I think insofar as just core infrastructure goes, it's like, no, we can run this thing for a while. Always looking for volunteers. Again, like I said, even if it's you're going to rotate in for a couple of weeks and do a little project here or there, we're always happy to have folks come help out. I frequently recommend to people who say, I want to get into technology, I want to do infrastructure, I don't have any experience: go find an open source project like Hackaderm and
help out. Absolutely. Like, I've sent people to the Fedora project, the Kubernetes project, and I've sent people over to Hackaderm. I'm like, if you want to know how this stuff runs, there are people in Discord that will literally just sit with you, let you watch them do some work, and then hand over the keys: hey, let's do this together. And then you're going to get actual professional resume experience, where you're like, I helped with this project's scaling. So if you're trying to get into this space, there are open source projects, there are foundations like Nivenly that create and run infrastructure on the public internet, and it will be beneficial for you to understand it, and you can get it for free. Just spend your time there. It's a great learning experience. Absolutely. Absolutely. So yeah, if you want to do Linux sysadmin work, Postgres work, Ruby on Rails scaling, distributed systems scaling, like,
Yeah, come on, y'all. I see it so often on LinkedIn, Reddit, wherever. People are like, how do I get into infrastructure with no experience? I'm like, go find some experience. You can do it at your home lab, that's fine, but open source projects are asking for help. Absolutely. And you can go help out. You can go say, I'm going to show up here once a week, twice a week, whatever time you have. And they're like, great, here's some docs, let us know where there's problems in the docs. Even that's helpful in so many cases. Yeah, even just documentation work.
If I move out of engineering, I think I will definitely join one of the Kubernetes release teams just to not lose experience doing release and infrastructure. Yeah.
And the Kubernetes one, I was on a release team before, and it's a lot. There are so many sub-projects that roll up into this thing. Yeah, it's a lot of fun. So Preston, thank you so much for coming on the show. Thank you for talking to us all about Hackaderm. If people want to find you, what's your Hackaderm handle? Where should they go? So I am esk at hackaderm.io. I don't post a whole heck of a lot, and when I do, it's usually synthesizer stuff.
So no one can see your background right now, but I was looking at all those synthesizers and wires back there, and I'm like, that looks like fun. Like, I don't need another hobby. Yeah. Yeah, it's a fun hobby. It's kind of my medium in between doing, I guess, real-world engineering, where I get to work with electricity and stuff like that, and it kind of blends that with creativity, because you get to sometimes create music, usually create noise. It's basically engineering too. Yeah, there you go. Exactly. No difference. That describes all my git commits. This is fine. Yeah, exactly. At least it gets me off a computer screen, how about that? A different keyboard. Different keyboard. Yeah. I don't have a mechanical keyboard; mine, if you actually like piano, like, I got wires in mine. Exactly. All right. Thank you so much for coming on the show, and we'll talk to you again soon. Absolutely. Thank you so much. Thank you.
Thanks for listening to Ship It with Justin Garrison and Autumn Nash. If you haven't checked out our ChangeLog newsletter, do yourself a favor and head to changelog.com slash news. There you'll find 29 reasons, yes, 29 reasons why you should subscribe. I'll tell you reason number 17, you might actually start looking forward to it.
Sounds like somebody's got a case of the Mondays. 28 more reasons are waiting for you at changelog.com slash news.
Thanks again to our partners at Fly.io. Over 3 million apps have launched on Fly, and you can too in five minutes or less. Learn more at Fly.io. And thanks, of course, to our beat freak in residence. We couldn't bump the best beats in the biz without Breakmaster Cylinder. That's all for now, but come back next week when we continue discussing everything that happens after Git Push.
Bye.