3 records in total
  • 1. ChinaXiv:202211.00439

    Canonical Workflows to Make Data FAIR

    Category: Computer Science >> Integration Theory of Computer Science Submitted: 2022-11-28 Partner journal: Data Intelligence (English)

    Peter Wittenburg, Alex Hardisty, Yann Le Franc, Amirpasha Mozaffari, Limor Peer, Nikolay A. Skvortsov, Zhiming Zhao, Alessandro Spinuso

    Abstract: The FAIR principles have been accepted globally as guidelines for improving data-driven science and data management practices, yet the incentives for researchers to change their practices are presently weak. In addition, data-driven science has been slow to embrace workflow technology despite clear evidence of recurring practices. To overcome these challenges, the Canonical Workflow Frameworks for Research (CWFR) initiative suggests a large-scale introduction of self-documenting workflow scripts to automate recurring processes or fragments thereof. This standardised approach, with FAIR Digital Objects as anchors, will be a significant milestone in the transition to FAIR data without adding load onto the researchers who stand to benefit most from it. This paper describes the CWFR approach and the activities of the CWFR initiative over roughly the last year, highlights several projects that hold promise for the CWFR approaches, including Galaxy, Jupyter Notebook, and RO-Crate, and concludes with an assessment of the state of the field and the challenges ahead.

     Views: 1319  Downloads: 417  Comments: 0
  • 2. ChinaXiv:202211.00440

    HPC-oriented Canonical Workflows for Machine Learning Applications in Climate and Weather Prediction

    Category: Computer Science >> Integration Theory of Computer Science Submitted: 2022-11-28 Partner journal: Data Intelligence (English)

    Amirpasha Mozaffari, Michael Langguth, Bing Gong, Jessica Ahring, Adrian Rojas Campos, Pascal Nieters, Otoniel José Campos Escobar, Martin Wittenbrink, Peter Baumann, Martin G. Schultz

    Abstract: Machine learning (ML) applications in weather and climate are gaining momentum as big data and the immense increase in high-performance computing (HPC) power pave the way. Ensuring FAIR data and reproducible ML practices is a significant challenge for Earth system researchers. Even though the FAIR principles are well known to many scientists, research communities are slow to adopt them. The Canonical Workflow Framework for Research (CWFR) provides a platform to ensure the FAIRness and reproducibility of these practices without overwhelming researchers. This conceptual paper envisions a holistic CWFR approach towards ML applications in weather and climate, focusing on HPC and big data. Specifically, we discuss the FAIR Digital Object (FDO) and the Research Object (RO) in the DeepRain project to achieve granular reproducibility. DeepRain is a project that aims to improve precipitation forecasts in Germany by using ML. Our concept envisages the raster datacube to provide data harmonization and fast, scalable data access. We suggest the Jupyter notebook as a single reproducible experiment. In addition, we envision JupyterHub as a scalable and distributed central platform that connects all these elements and the HPC resources to the researchers via an easy-to-use graphical interface.

     Views: 1808  Downloads: 555  Comments: 0
  • 3. ChinaXiv:202211.00441

    Enabling Canonical Analysis Workflows Documented Data Harmonization on Global Air Quality Data

    Category: Computer Science >> Integration Theory of Computer Science Submitted: 2022-11-28 Partner journal: Data Intelligence (English)

    Sabine Schröder, Eleonora Epp, Amirpasha Mozaffari, Mathilde Romberg, Niklas Selke, Martin G. Schultz

    Abstract: Data harmonization and documentation of the data processing are essential prerequisites for enabling Canonical Analysis Workflows. The recently revised terabyte-scale air quality database system, which the Tropospheric Ozone Assessment Report (TOAR) created, contains one of the world's largest collections of near-surface air quality measurements and considers FAIR data principles as an integral part. A special feature of our data service is the on-demand processing and product generation of several air quality metrics directly from the underlying database. In this paper, we show that the necessary data harmonization for establishing such online analysis services goes much deeper than the obvious issues of common data formats, variable names, and measurement units, and we explore how the generation of FAIR Digital Objects (FDO) in combination with automatically generated documentation may support Canonical Analysis Workflows for air quality and related data.

     Views: 1928  Downloads: 600  Comments: 0
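  • Operated by: National Science Library, Chinese Academy of Sciences
  • Produced and maintained by: Knowledge Systems Department, National Science Library, Chinese Academy of Sciences
  • Email: eprint@mail.las.ac.cn
  • Address: 33 Beisihuan West Road, Zhongguancun, Beijing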
Copyright © 2016 National Science Library, Chinese Academy of Sciences