A term for "a series with lost data"


vgiv

Member
Russian
Let us suppose that you had a series of data (e.g. a set of daily measurements of temperature) and have randomly removed data points from it (a pretty large part of the data, e.g. 95%). How can you call the resulting series? "A decimated series", "a narrowed series", "a winnowed series", or something else?

I'm sure that there is a special term for it, but cannot find it.

Thank you in advance,
Vladimir.
 
  • entangledbank

    Senior Member
    English - South-East England
    I don't know of a term for this. A sequence or matrix that only has a few non-empty or non-zero terms is called sparse.
     

    vgiv

    Member
    Russian
    The translator of my paper into English has written "a winnowed series", but I rather doubt it.

    entangledbank, "a sparse series" is a bit different term. I want to stress that a part of my data is lost, not zeroed.

    suzi br, yes, one can write, say, "a randomly selected subseries", but it is too long.
     

    entangledbank

    Senior Member
    English - South-East England
    It appears there is a winnowing algorithm in machine learning, but it doesn't fit what you want. And 'decimate' is used to mean reducing the sampling rate, which is similar but again not exactly what you want.
     

    Edinburgher

    Senior Member
    German/English bilingual
    I suppose you could call the sequence sparsely-sampled or perhaps compacted. Presumably the purpose of the compaction is to avoid storing unnecessary data, while at the same time taking care you don't lose too much significant information about the continuous function you are sampling. For example, if you are measuring the heights of ocean tides at a coastal location, for the purpose of carrying out a harmonic analysis, you really don't need to sample it every second, or every minute. Every ten minutes will probably suffice.
    "How can you call the resulting series?" → "What can you call the resulting sequence?"
    Strictly-speaking a series is a sum. Not in everyday language, but in the language of mathematics.
     

    entangledbank

    Senior Member
    English - South-East England
    Yes, I've been googling for things like "decimate" "sequence", because "sequence" gives much more relevant results than "series".
     

    vgiv

    Member
    Russian
    I suppose you could call the sequence sparsely-sampled or perhaps compacted. Presumably the purpose of the compaction is to avoid storing unnecessary data, while at the same time taking care you don't lose too much significant information about the continuous function you are sampling. For example, if you are measuring the heights of ocean tides at a coastal location, for the purpose of carrying out a harmonic analysis, you really don't need to sample it every second, or every minute. Every ten minutes will probably suffice.

    Alas, it's not my case, I think. I'm speaking about chaotically corrupted data rather than intentionally compressed or compacted ones.

    Strictly-speaking a series is a sum. Not in everyday language, but in the language of mathematics.
    Sums? I've never heard of that. I don't know exactly about mathematics, but in astronomy the term "time series" is widely used, and it means just "a sequence of measurements" (e.g. [1601.03536] Time series analysis of long-term photometry of BM Canum Venaticorum)
     

    JulianStuart

    Senior Member
    English (UK then US)
    If I had removed the data points from the set, I would refer to it as a "reduced data set". If it has been done to make the set smaller, by some defined protocol, it could be a "compressed data set". The context isn't very clear (your title uses "lost" data which implies accidental:)) - there are many ways to "compress" sets of numbers, some lossy, some lossless. (Data compression - Wikipedia, the free encyclopedia)

    If you have (effectively) randomly selected 5% of the data to retain, then I have no idea what that might be called!
     

    vgiv

    Member
    Russian
    your title uses "lost" data which implies accidental:)
    Yes, that is exactly the case! Imagine that a scientist measured the air temperature once a day. But a good deal of his records were lost or corrupted, so now only occasional temperature values (say, one or two a month) are available. There was a whole time series (once a day), and now one has only a randomly selected subset of it. In Russian one can call the resulting series "прореженный (ряд)". Google Translate gives "decimated", "thinned" or "sparse (series)" for it, but I'm not sure about any of these terms.
     

    JulianStuart

    Senior Member
    English (UK then US)
    I was confused by your use of "randomly selected" - I thought the selection was an action performed by the researcher(!)
    I've not encountered this situation very often so know of no "official" name for it. I'd refer to a "residual" data set or the "residue" or "remains" of the original data set:) As an adjective, one might contemplate "vestigial":eek:
    (from the dictionary under vestige:
    a mark, trace, or visible evidence of something that is no longer present or in existence: the last vestiges of a once great empire [dataset].)
     

    entangledbank

    Senior Member
    English - South-East England
    I can think of many things such a scientist would call that data, but none would be printable in a scientific journal. :)
     

    vgiv

    Member
    Russian
    I was confused by your use of "randomly selected" - I thought the selection was an action performed by the researcher(!)
    (from the dictionary under vestige:
    From the viewpoint of data analysis there is no difference between the case with a researcher who rolls a dice to select some subset of data and any other stochastic process:)

    I've not encountered this situation very often so know of no "official" name for it. I'd refer to a "residual" data set or the "residue" or "remains" of the original data set:) As an adjective, one might contemplate "vestigial":eek:
    (from the dictionary under vestige:
    Yes, "vestigial" sounds great:) And "residual series" seems to be good too. Thanks a lot.
     

    Edinburgher

    Senior Member
    German/English bilingual
    Perhaps "trace data" might also serve. We use "trace" sometimes for when there is only a tiny bit of something, often when the quantity is too small to measure.
     

    JulianStuart

    Senior Member
    English (UK then US)
    From the viewpoint of data analysis there is no difference between the case with a researcher who rolls a dice to select some subset of data and any other stochastic process:)
    The resulting datasets may behave the same way, but the dictionary entry for select sounds like it's the wrong word for the situation:) Preference seems like an antonym for random!
    select
    • to choose in preference;
      pick:Only the best students were selected for admission.
     

    Myridon

    Senior Member
    English - US
    From the viewpoint of data analysis there is no difference between the case with a researcher who rolls a dice to select some subset of data and any other stochastic process:)
    Throwing out the results that don't support your hypothesis is exactly the same as randomly selecting data? ;)
     

    PaulQ

    Senior Member
    UK
    English - England
    It strikes me that what is left is not a series, it is simply "the remaining/surviving data".
     

    Edinburgher

    Senior Member
    German/English bilingual
    Throwing out the results that don't support your hypothesis is exactly the same as randomly selecting data? ;)
    Myridon, what hypothesis? I've said that data points were removed randomly.
    vgiv: Myridon was only joking, but you should not use the word "removed" here, because it carries an implication that data were deliberately selected for removal.

    Myridon: See vgiv's reply #8 which, owing to finger-trouble, appeared in the box quoting me instead of below it, making it look as though it was part of what I had written (and was therefore uninteresting:(). The missing data were apparently obscured/corrupted by some natural interfering phenomenon.
     

    vgiv

    Member
    Russian
    Thanks to everybody! I cannot say that everything is crystal clear now, but I got some ideas.
     