FTWynn, to random

If you aren't keeping track of these 4 usage areas in your Observability tooling, you'll never be able to optimize its value.

#observability #o11y #monitoring

adrianamvillela, to random
@adrianamvillela@hachyderm.io

What happens when you're an Observability vendor migrating to @opentelemetry? @jea knows exactly what that's like, as he shares the story of how he worked on migrating to OpenTelemetry at ServiceNow Cloud Observability (formerly Lightstep).

📺: https://youtu.be/pHHINe9D94w

mdepalol, to random
@mdepalol@discuss.systems

I've just finished reading the "Observability Engineering" book by O'Reilly.

It's a good book. I must admit I've learnt a lot about observability, even if I already had a good understanding of the subject.

I've especially enjoyed the data storage chapters and some gems about the "cultural" aspects of observability.

My most important takeaway, though, is the concept of the Error Budget (https://www.blameless.com/blog/error-budget); I'm definitely going to put that into practice soon.

That said, while the book is great, I feel it's too long; the authors could have taken a more pragmatic approach to some of the chapters, and there's a lot of repetition.

Easier said than done, of course.
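
A quick back-of-the-envelope illustration of the Error Budget idea above, assuming a 99.9% monthly availability SLO (my numbers, not the book's):

# Error budget = 1 - SLO target: the share of requests (or minutes)
# allowed to fail before the SLO is breached.
SLO_TARGET = 0.999                 # assumed 99.9% availability
MINUTES_PER_MONTH = 30 * 24 * 60

error_budget_fraction = 1 - SLO_TARGET
error_budget_minutes = error_budget_fraction * MINUTES_PER_MONTH

print(f"Error budget: {error_budget_fraction:.1%} of requests")
print(f"Allowed downtime: {error_budget_minutes:.0f} minutes/month")
# -> roughly 43 minutes of downtime in a 30-day month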

FTWynn, to random

Outside of "how much" and "where is all of it," what should you talk to your users about re: their #o11y data needs?

Workflows?
Tooling gaps?
Metrics to improve?
Platform feature requests?
Current toil that feels unnecessary?
What other data should you bring to the discussion?

#observability #monitoring

FTWynn, to random

The most important factor in getting your logs under control is routing them to the right place, /dev/null included. If you're trying to optimize log costs in a system that's already charged you dollars per gig on ingress, you've already lost the battle.

#observability #o11y #monitoring
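
A minimal sketch of that routing idea, assuming records can be classified before they reach a paid ingest endpoint; the rules and destinations here are invented for illustration:

# Hypothetical pre-ingress log router: decide where each record goes
# *before* paying per-GB ingest. Rules and sinks are illustrative only.
import sys

def route(record: dict) -> str:
    if record.get("level") == "DEBUG" and not record.get("sampled"):
        return "devnull"        # drop it: never pay to ship it
    if record.get("source") == "audit":
        return "cold_storage"   # cheap object storage, rarely queried
    return "hot_backend"        # the expensive, indexed tool

def ship(record: dict) -> None:
    destination = route(record)
    if destination == "devnull":
        return
    # Replace the print with a real writer per destination.
    print(f"{destination}: {record}", file=sys.stdout)

ship({"level": "DEBUG", "msg": "cache miss"})               # dropped
ship({"level": "INFO", "source": "audit", "msg": "login"})  # cold storage
ship({"level": "ERROR", "msg": "payment failed"})           # hot backend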

FTWynn, to random

Teams should hold a regular review to determine which of their #Observability data is actually being used. Otherwise, "just in case" becomes a valueless justification with uncapped costs.

#o11y #monitoring
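
One way to make that review concrete is to export last-used timestamps from whatever tooling you run and flag anything stale. A rough sketch against invented usage data:

# Flag dashboards or saved queries nobody has touched recently.
# The records below stand in for a usage export from your o11y tool.
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=90)
NOW = datetime(2023, 9, 1)   # pinned so the example is reproducible

usage = [
    {"name": "checkout-latency", "last_viewed": datetime(2023, 8, 30)},
    {"name": "legacy-batch-jobs", "last_viewed": datetime(2022, 11, 2)},
]

for item in usage:
    idle = NOW - item["last_viewed"]
    if idle > STALE_AFTER:
        print(f"REVIEW: {item['name']} unused for {idle.days} days")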

FTWynn, to random

Saving money on #Observability tooling is incredibly simple. Turn off all the tooling. Maximum savings instantly achieved.

But if you want something short of that extreme, you'll need a coherent #o11y framework, an understanding of your business, and some judgment.

#monitoring

adrianamvillela, to random
@adrianamvillela@hachyderm.io

Q&A TODAY!!✨ @hazelweakly joins us to share some gold nuggets from her personal experiences with observability at this week's OTel Q&A:

DATE: 2023.08.31
TIME: 13:00 EDT/10:00 PDT
CALENDAR DETAILS: https://shorturl.at/dghy2

adrianamvillela, to opensource
@adrianamvillela@hachyderm.io

✨Learn how to contribute to OpenTelemetry!! ✨Are you an observability practitioner? Have you ever wanted to contribute back to OpenTelemetry, but didn’t know where to begin? Then check out my latest blog post! 👇

https://adri-v.medium.com/how-to-contribute-to-opentelemetry-5962e8b2447e

FTWynn, to random

The Speed of Light Will Cap Traditional Centralized #Observability

There are lots of reasons that DevOps teams have been looking into #o11y Pipelines and their in-flight processing possibilities: cost, performance. But I rarely hear about the hardest limit:

The Speed of Light

FTWynn,

Normally, the Red<>Green band is much wider for cloud migrations. I've shifted it specifically for #Observability, where data's half-life is short and its immediacy is vital.

Put simply, there is a hard limit to how much data you can get across the wire in the needed time.
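
A rough illustration of that limit, using nothing but propagation delay in fiber (roughly 2/3 the speed of light) over an assumed transatlantic distance; real network paths are longer and slower:

# Lower bound on latency imposed by physics, ignoring every other hop.
SPEED_OF_LIGHT_KM_S = 299_792   # vacuum
FIBER_FACTOR = 0.67             # light in fiber travels at roughly 2/3 c
DISTANCE_KM = 6_000             # assumed NYC <-> Frankfurt, as the fiber runs

one_way_ms = DISTANCE_KM / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR) * 1000
print(f"One-way propagation delay: {one_way_ms:.0f} ms minimum")
print(f"Round trip: {2 * one_way_ms:.0f} ms minimum")
# ~30 ms each way before routers, queues, or retransmits -- and that cost
# applies to every round trip your centralized backend forces.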

FTWynn, to random

What are the most important inputs and outputs to track in Observability? A few ideas...

Inputs:

  • Data ingested
  • Time spent building/updating tools

Outputs:

  • Costs
  • MTTR
  • Bugs caught
  • Time spent in tools
  • o11y support requests
  • Number of user queries and dashboards

#observability #o11y
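
A toy sketch of turning a few of those inputs and outputs into ratios worth watching over time (every number here is invented):

# Toy value scorecard: relate what the practice consumes to what it returns.
inputs = {"gb_ingested_per_month": 12_000, "tooling_hours_per_month": 80}
outputs = {"monthly_cost_usd": 30_000, "incidents_resolved": 25,
           "mean_mttr_minutes": 42, "bugs_caught_pre_prod": 60}

cost_per_gb = outputs["monthly_cost_usd"] / inputs["gb_ingested_per_month"]
cost_per_incident = outputs["monthly_cost_usd"] / outputs["incidents_resolved"]

print(f"Cost per GB ingested: ${cost_per_gb:.2f}")
print(f"Cost per incident resolved: ${cost_per_incident:,.0f}")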

Martindotnet, to random
@Martindotnet@hachyderm.io

Going live soon with @jessitron, taking another look at observability and the demo app to see what other ways we can break it! Because chaos engineering without observability is just chaos...

https://twitch.tv/ssObservability

FTWynn, to random

Because Observability is a meta-practice, at what point does it deserve focused attention instead of being an afterthought? Launch? A scale threshold? Downtime thresholds? Dev burnout?

#observability #o11y

FTWynn, to random

In order to improve your Observability practice, you first need to write down what you want from it. Otherwise, the path beyond Collect > Search > Display becomes impossibly murky.

#observability #o11y

adrianamvillela, to python
@adrianamvillela@hachyderm.io

Super stoked to have had my latest blog post (OTel Python Logging Auto-Instrumentation with the OTel Operator) featured on O11y News. Check it out!

https://o11y.news/2023-08-21/

FTWynn, to random

I'm looking forward to seeing how all the Observability tools change as OTel gains more and more mindshare. If collection isn't the primary value for a vendor, what is?

#observability #o11y #opentelemetry

FTWynn, to random

Building an Observability practice that depends on large amounts of egress, the highest-margin product in the cloud, is not sustainable.
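
Some hedged arithmetic behind that claim, using a commonly cited ~$0.09/GB internet egress list price (actual rates vary by provider, region, and volume discounts):

# Rough monthly egress bill for shipping telemetry out of the cloud.
EGRESS_USD_PER_GB = 0.09       # assumed list price, not a quote
TELEMETRY_GB_PER_DAY = 2_000   # assumed volume for a mid-sized fleet

monthly_egress_cost = EGRESS_USD_PER_GB * TELEMETRY_GB_PER_DAY * 30
print(f"Egress alone: ${monthly_egress_cost:,.0f}/month")
# roughly $5,400/month before the observability vendor bills anything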

FTWynn, to random

Searching for the Ideal Orange Peeling Method Has Taught Me the 4 Most Important Principles in Observability

Many of us have hobbies. Many of them are beautiful or useful to the world. Mine is not.

My personal white whale is to find the perfect way to peel an orange. Years of research and experimentation have not yet led to an ideal solution, but that's also precisely why it's taught me 4 key principles about Observability.

1/5

jcuff, to linux
@jcuff@mastodon.mit.edu

2023 sysadmin resume:

“Installed and orchestrated a 1,037,089 node cluster for a popular iOS app to autodetect on the in an afternoon. Deployed (v0.014), and wrote a novel global pipeline in to stream over 5,000PB/minute of and data to a set of fifty billion objects. RedHat certified.”

2001 sysadmin resume:

“Managed to exit vim once. RedHat certified”

FTWynn, to random

3 𝐓𝐚𝐤𝐞𝐚𝐰𝐚𝐲𝐬 𝐟𝐫𝐨𝐦 𝐓𝐡𝐢𝐬 𝐖𝐞𝐞𝐤'𝐬 𝐌𝐞𝐧𝐭𝐚𝐥 𝐌𝐨𝐝𝐞𝐥𝐢𝐧𝐠 𝐭𝐡𝐚𝐭 𝐖𝐢𝐥𝐥 𝐈𝐦𝐩𝐫𝐨𝐯𝐞 𝐘𝐨𝐮𝐫 𝐃𝐞𝐯𝐎𝐩𝐬 𝐎𝐛𝐬𝐞𝐫𝐯𝐚𝐛𝐢𝐥𝐢𝐭𝐲 𝐏𝐫𝐚𝐜𝐭𝐢𝐜𝐞

🚘 𝐖𝐡𝐞𝐧 𝐝𝐫𝐢𝐯𝐢𝐧𝐠 𝐚 𝐜𝐚𝐫, 𝐭𝐢𝐫𝐞 𝐡𝐞𝐚𝐥𝐭𝐡 𝐢𝐬 𝐯𝐢𝐭𝐚𝐥 𝐭𝐨 𝐝𝐫𝐢𝐯𝐢𝐧𝐠 𝐬𝐦𝐨𝐨𝐭𝐡𝐥𝐲, 𝐛𝐮𝐭 𝐭𝐢𝐫𝐞𝐬 𝐝𝐨𝐧'𝐭 𝐡𝐚𝐯𝐞 𝐚𝐧𝐲 𝐝𝐢𝐫𝐞𝐜𝐭 𝐦𝐞𝐭𝐫𝐢𝐜𝐬

  • Are there any "tires" in your system?
  • Are you using a proxy for the tires in your (...)
FTWynn,

🍲 𝐖𝐡𝐞𝐧 𝐥𝐨𝐨𝐤𝐢𝐧𝐠 𝐚𝐭 𝐡𝐨𝐰 𝐰𝐞𝐥𝐥 𝐩𝐫𝐞𝐩𝐚𝐫𝐢𝐧𝐠 𝐚 𝐦𝐞𝐚𝐥 𝐰𝐞𝐧𝐭, 𝐲𝐨𝐮 𝐬𝐡𝐨𝐮𝐥𝐝 𝐛𝐨𝐭𝐡 𝐚𝐬𝐤 𝐭𝐡𝐞 𝐞𝐚𝐭𝐞𝐫𝐬 𝐀𝐍𝐃 𝐥𝐨𝐨𝐤 𝐭𝐨 𝐬𝐞𝐞 𝐢𝐟 𝐭𝐡𝐞𝐲 𝐮𝐬𝐞𝐝 𝐞𝐱𝐭𝐫𝐚 𝐬𝐚𝐥𝐭 𝐨𝐫 𝐤𝐞𝐭𝐜𝐡𝐮𝐩

  • Are you blending feedback methods to see if your practice is working for those who use it?
  • Have you done surveys in addition to understanding usage metrics of the tools?

2/2

FTWynn, to random

What is your go-to mental model for thinking about Observability?

In talking with DevOps, SRE, and application teams, I find there aren't many detailed mental models for thinking through what an Observability practice is and what it should do.

So here's a short list of models with potential:

  • Driving a car
  • Flying a plane
  • Cooking a big meal

What are other mental models you use to think through running your applications?

cdanis, to devops

Hello! I'm Chris, a Site Reliability Engineer (SRE) working at @wikimediafoundation, the non-profit that administers Wikipedia and other projects. My posts here will focus on improving and , user-centric , working with and , sometimes or . I love silly and tinkering with systems. For fun I dabble in photography and gaming.
