I'm sure that there is a special term for it, but cannot find it.

Thank you in advance,

Vladimir.

- Thread starter vgiv

~~How~~ What can you call the resulting ~~series~~ sequence?

Strictly speaking, a series is a sum. Not in everyday language, but in the language of mathematics.

Alas, that's not my case, I think. I'm talking about chaotically corrupted data rather than intentionally compressed or compacted data.

Sums? I've never heard of that. I don't know exactly about mathematics, but in astronomy the term "time series" is widely used, and it means just "a sequence of measurements" (e.g. [1601.03536] Time series analysis of long-term photometry of BM Canum Venaticorum).

If I had removed the data points from the set, I would refer to it as a "reduced data set". If it has been done to make the set smaller, by some defined protocol, it could be a "compressed data set". The context isn't very clear (your title uses "lost" data which implies accidental) - there are many ways to "compress" sets of numbers, some lossy, some lossless. (Data compression - Wikipedia, the free encyclopedia)

If you have (effectively) randomly selected 5% of the data to retain, then I have no idea what that might be called!

Yes, that is exactly the case! Imagine that a scientist measured the air temperature once a day. But a good deal of his records were lost or corrupted, so now only occasional temperature values (say, one or two a month) are available. There was a whole time series (once a day), and now one has only a randomly selected subset of it. In Russian one can call the resulting series "прореженный (ряд)". Google Translate gives "decimated", "thinned" or "sparse (series)" for it, but I'm not sure about any of these terms.
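For illustration only, the situation described above can be sketched in a few lines of Python. Everything here is an assumption made for the example: the function name `thin_series`, the 5% retention rate mentioned earlier in the thread, and the made-up temperature values. It is not anyone's actual data or method.

```python
import random
from datetime import date, timedelta

def thin_series(series, keep_fraction, seed=0):
    """Keep each (date, value) point independently with probability keep_fraction.

    Hypothetical helper: it mimics random loss of records, not any
    particular corruption mechanism.
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return [point for point in series if rng.random() < keep_fraction]

# Made-up daily "temperatures" for one year, one value per day.
start = date(2015, 1, 1)
daily = [(start + timedelta(days=i), 10.0 + 0.01 * i) for i in range(365)]

# Roughly 5% of the records survive; the rest are treated as lost.
thinned = thin_series(daily, keep_fraction=0.05)
```

The surviving points keep their original dates, so the result is an irregularly sampled subset of the daily series rather than a shorter regular one.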

I've not encountered this situation very often, so I know of no "official" name for it. I'd refer to a "residual" data set, or the "residue" or "remains" of the original data set. As an adjective, one might contemplate "vestigial".

(from the dictionary under vestige:

a mark, trace, or visible evidence of something that is no longer present or in existence: the last vestiges of a once great ~~empire~~ dataset.)

I was confused by your use of "randomly selected" - I thought the selection was an action performed by the researcher(!)

From the viewpoint of data analysis, there is no difference between a researcher who rolls a die to select some subset of the data and any other stochastic process.

Yes, "vestigial" sounds great. And "residual series" seems good too. Thanks a lot.

The resulting datasets may behave the same way, but going by the dictionary entry for "select", it sounds like the wrong word for the situation. Preference seems like an antonym for random!

select

- to choose in preference; pick: Only the best students were selected for admission.

Throwing out the results that don't support your hypothesis is exactly the same as randomly selecting data?

Myridon, what hypothesis? I've said that data points were removed randomly.

vgiv: Myridon was only joking, but you should not use the word "removed" here, because it carries an implication that data were deliberately selected for removal.

Myridon: See vgiv's reply #8 which, owing to finger-trouble, appeared in the box quoting me instead of below it, making it look as though it was part of what I had written (and was therefore uninteresting). The missing data were apparently obscured/corrupted by some natural interfering phenomenon.