Mehrad,
@Mehrad@fosstodon.org avatar

I'm trying to integrate some public air quality data into my study. During a sanity check of the data I realized that 3 of the measurement columns contain negative values! Does anyone know whether negative values are valid in such measurements, and if so, how they should be interpreted?

Contacting the data manager is not that easy and might take me a week or two of emailing to get an answer. I wonder if folks here on the fediverse have a quick answer.
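A sanity check like the one described above could look roughly like the following. This is only a minimal sketch using the Python standard library: the column names (`CO`, `NO`) and all readings are made up for illustration, and missing values are assumed to appear as `None` or NaN.

```python
import math

def summarize_negatives(rows, columns):
    """Count missing and negative values per column and report the minimum."""
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat both None and NaN as missing (an assumption about the data).
        present = [v for v in values if v is not None and not math.isnan(v)]
        negatives = [v for v in present if v < 0]
        summary[col] = {
            "n_missing": len(values) - len(present),
            "n_negative": len(negatives),
            "min": min(present) if present else None,
        }
    return summary

# Made-up readings in µg/m³; the column names are hypothetical.
rows = [
    {"CO": 210.0, "NO": 4.2},
    {"CO": -30.0, "NO": None},
    {"CO": 180.5, "NO": -1.3},
    {"CO": float("nan"), "NO": 2.0},
]
report = summarize_negatives(rows, ["CO", "NO"])
for column, stats in report.items():
    print(column, stats)
```

Reporting missing and negative counts side by side makes it easier to tell whether the negatives are rare outliers or a systematic feature of a column.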

beps,

@Mehrad Negative data are definitely not valid data and should be discarded. They can also be an indication of other issues in that specific dataset.

Usually, data acquired by monitoring stations run by public institutions go through validation processes that should rule out those kinds of errors. Time series are often provided with a flag stating whether each data point has been validated or not. I guess, unfortunately, that is not the case here.

Mehrad,
@Mehrad@fosstodon.org avatar

@beps
Thanks for the comment. This data is from a public institute 😬 I got it from the Finnish Meteorological Institute OpenData repo. I guess it is time to contact them and discuss the matter in detail.

Thanks again.

beps,

@Mehrad
You are welcome.

I also forgot to mention that some institutions ask for money for validated data due to the cost of post processing (!).
Though I do not know anything about the Finnish dataset you are working with.

0rkk0,
@0rkk0@fosstodon.org avatar

@Mehrad Common sense tells me the negative measurements could be bad or uncalibrated data. However, on a largely unrelated topic, I have seen the notion of *differential mass* being positive or negative depending on particle density: http://manalis-lab.mit.edu/publications/godin_APL_2007.pdf
Not much more to tell.

tomstafford,
@tomstafford@mastodon.online avatar

@Mehrad I've done a bit of air quality data analysis and never had negatives. Sounds like it could be due to calibration issues?

Mehrad,
@Mehrad@fosstodon.org avatar

@tomstafford Thanks. That was my hunch as well. The issue is that the data is historical, from decades back. The good news is that the negative values all come from one particular weather station. So if I can convince them to give me some info about maintenance dates, I might be able to compare the distributions and shifted means between maintenance intervals to assess which time interval is problematic, and fix it using the "correct" intervals.

Thanks for the comment 🍻
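The interval comparison described above could be sketched along these lines. It is a rough illustration only: the maintenance dates and readings are invented, readings are assumed to be `(date, value)` pairs with `None` for missing values, and a reading taken on a maintenance day is assigned to the interval that starts on that day.

```python
from bisect import bisect_right
from datetime import date
from statistics import mean, median

def interval_stats(readings, maintenance_dates):
    """Group (date, value) readings into intervals split at maintenance dates
    and return per-interval mean, median, and share of negative values."""
    cuts = sorted(maintenance_dates)
    groups = {}
    for day, value in readings:
        if value is None:
            continue  # skip missing readings
        # Interval 0 is before the first maintenance date, 1 follows it, etc.
        idx = bisect_right(cuts, day)
        groups.setdefault(idx, []).append(value)
    return {
        idx: {
            "mean": mean(vals),
            "median": median(vals),
            "negative_share": sum(v < 0 for v in vals) / len(vals),
        }
        for idx, vals in sorted(groups.items())
    }

# Hypothetical readings and maintenance dates, for illustration only.
readings = [
    (date(1995, 1, 10), 5.0),
    (date(1995, 2, 10), 7.0),
    (date(1995, 4, 10), -12.0),
    (date(1995, 5, 10), -8.0),
    (date(1995, 7, 10), 6.0),
]
maintenance = [date(1995, 3, 1), date(1995, 6, 1)]
stats = interval_stats(readings, maintenance)
for idx, summary in stats.items():
    print(idx, summary)
```

An interval whose mean and median sit well below its neighbours, with a high share of negatives, would be a candidate for the problematic period. Whether a simple shift is enough to correct it depends on how the sensor actually drifted, which only the station's records can confirm.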

jospueyo,
@jospueyo@fosstodon.org avatar

@Mehrad @tomstafford if all negative values are exactly -1, it can be the method to express missing values.

Mehrad,
@Mehrad@fosstodon.org avatar

@jospueyo
True, but there are already missing values (NA) in all those columns, and the minimum goes as low as -30. This is becoming a mystery 😅
@tomstafford
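One quick way to test the sentinel idea raised above is to tally the distinct negative values: a missing-value code such as -1 or -999 would show up as a single repeated value, whereas a spread of different negatives points elsewhere. A minimal sketch with made-up readings (function names are illustrative, not from any library):

```python
from collections import Counter

def negative_value_counts(values):
    """Tally each distinct negative value, ignoring missing entries (None)."""
    return Counter(v for v in values if v is not None and v < 0)

def looks_like_sentinel(values):
    """True if every negative entry is the same single value (e.g. -1 or -999)."""
    return len(negative_value_counts(values)) == 1

# Made-up readings: several different negatives, so not a single sentinel.
readings = [12.0, -1.0, None, 8.5, -30.0, -4.2, 3.1]
counts = negative_value_counts(readings)
print(counts.most_common())
print(looks_like_sentinel(readings))
```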

defuneste,
@defuneste@fosstodon.org avatar

@Mehrad @jospueyo @tomstafford

I do not know the specific answer, but I have definitely seen some sensors produce this kind of error. It could also be an error that occurs when the sensor transmits data 🤷‍♂️

doomsdayrs,

deleted_by_author

  • Mehrad,
    @Mehrad@fosstodon.org avatar

    @doomsdayrs
    Thanks for the comment. Can you elaborate? I mean, how can a fire cause negative values in CO and NO readings, considering the unit of the data (µg/m³)? The data is not log-transformed as far as I can tell, and the absence of something in the air should result in zero. Right? What am I missing?

    I'm completely uneducated about air quality measurement methods, so I'm just looking for an explanation 😅
