
Are Data Science Pipelines Fuzzy Queries?

Genoveva Vargas-Solar 1, 2
2 BD - Base de Données
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : This short position paper proposes a new vision of data science pipelines as queries, namely data science queries (DSQs). Unlike classic queries, DSQs return not only data but also estimated models with associated error and performance scores. Moreover, a query can have several attainable results, depending on the algorithms that implement it behind the scenes. A data scientist must choose the best or best-suited result according to expectations tied to a target domain. In this sense, DSQs can be considered fuzzy queries that estimate results and choose those closest to a combination of expected criteria. The paper discusses the aspects to consider when modelling data science pipelines as fuzzy queries, along with possible research directions.
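To make the idea in the abstract concrete, the following is a minimal sketch, not the paper's method. It assumes that each algorithm behind a query yields one candidate result carrying an error and a performance score, that expectations are expressed as linear fuzzy membership functions, and that criteria are combined with the minimum (one common choice of fuzzy AND). All names (`CandidateResult`, `at_most`, `at_least`, `answer_dsq`), the thresholds and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CandidateResult:
    """One attainable result of a query (hypothetical structure)."""
    algorithm: str  # illustrative label for the algorithm that produced it
    error: float    # estimated error, lower is better
    score: float    # performance score in [0, 1], higher is better


def at_most(value: float, ideal: float, worst: float) -> float:
    """Degree in [0, 1] to which `value` is small enough (linear ramp)."""
    if value <= ideal:
        return 1.0
    if value >= worst:
        return 0.0
    return (worst - value) / (worst - ideal)


def at_least(value: float, worst: float, ideal: float) -> float:
    """Degree in [0, 1] to which `value` is large enough (linear ramp)."""
    if value >= ideal:
        return 1.0
    if value <= worst:
        return 0.0
    return (value - worst) / (ideal - worst)


def satisfaction(candidate: CandidateResult) -> float:
    """Combine the expected criteria with min, a common fuzzy AND."""
    # Assumed expectations: error ideally <= 0.05 and unacceptable at 0.30;
    # score unacceptable at 0.60 and ideally >= 0.90.
    return min(
        at_most(candidate.error, ideal=0.05, worst=0.30),
        at_least(candidate.score, worst=0.60, ideal=0.90),
    )


def answer_dsq(candidates: list[CandidateResult]) -> CandidateResult:
    """Return the candidate closest to the combined expectations."""
    return max(candidates, key=satisfaction)


# Made-up candidates standing in for the outputs of three algorithms.
candidates = [
    CandidateResult("model_a", error=0.10, score=0.85),
    CandidateResult("model_b", error=0.04, score=0.70),
    CandidateResult("model_c", error=0.25, score=0.95),
]
best = answer_dsq(candidates)
print(best.algorithm, round(satisfaction(best), 3))
```

In this sketch no candidate fully satisfies both expectations, so the query returns the one with the highest degree of satisfaction rather than an exact match. Other combination operators (for example a weighted mean) would rank the candidates differently.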
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-03039443
Contributor : Genoveva Vargas-Solar
Submitted on : Wednesday, December 16, 2020 - 6:10:04 PM
Last modification on : Tuesday, June 1, 2021 - 2:08:08 PM
Long-term archiving on : Wednesday, March 17, 2021 - 6:10:36 PM

File

FSASSE_HPCS_2020.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03039443, version 1

Citation

Genoveva Vargas-Solar. Are Data Science Pipelines Fuzzy Queries? 2020 International Conference on High Performance Computing & Simulation, Jan 2021, Barcelona, Spain. ⟨hal-03039443⟩
