Category: Power and Electrical Engineering >> Engineering Thermophysics; Computer Science >> Computer Application Technology — Submitted: 2025-04-11
Abstract: In the current era of artificial intelligence, the advancement of high-performance computing based on electronic devices is hindered by thermal contact resistance. To predict this resistance accurately, we established a comprehensive database derived from extensive experimental work documented in previous studies. Using machine learning algorithms, we developed a thermal contact resistance prediction model trained on this dataset. The model can predict the thermal contact resistance between any materials represented in the training data, demonstrating a significant degree of general applicability. It performs strongly on the test set (coefficient of determination of 0.982), reflecting a high level of predictive accuracy. Additionally, interpretability analyses conducted on the machine learning model are consistent with established theories of thermal contact resistance, further confirming the model's accuracy. We anticipate that this database will support the development of thermal contact resistance prediction models and that our model will enhance the precision of thermal contact resistance predictions.
Category: Computer Science >> Natural Language Understanding and Machine Translation — Submitted: 2024-01-03
Abstract: League of Legends (LoL) is a highly popular multiplayer online competitive game, featuring intricate game mechanics and team cooperation, which makes match-outcome prediction a challenging task. This study uses a Kaggle dataset of 9,879 ranked matches, spanning Diamond I to Master tier, to build a machine learning model that predicts the ultimate winner, either the blue or red team, from features of the first 10 minutes of gameplay. Through data loading, preprocessing, and feature engineering, we provided effective inputs for the model. For model selection, we opted for the Logistic Regression algorithm, achieving an accuracy of 0.7277 after data splitting and training. This accuracy provides solid support for predicting the winning side. To further improve performance, however, we recommend exploring additional feature engineering methods, investigating alternative machine learning algorithms, and fine-tuning hyperparameters. Introducing deep learning models is also a promising avenue to better capture the complex relationships within the game. Through these improvements, we anticipate increasing the model's predictive accuracy for future matches, offering valuable insights for game development and enhancement.
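The pipeline this abstract outlines — first-10-minute features into a logistic regression classifier, evaluated on a held-out split — can be sketched roughly as below. The synthetic features (gold, experience, and kill differences between the blue and red teams) are assumed stand-ins for the Kaggle dataset's actual columns, and the generated labels are not real match outcomes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)
n = 2000
gold_diff = rng.normal(0, 1500, n)    # blue minus red gold at 10 minutes
exp_diff = rng.normal(0, 1200, n)     # experience difference
kill_diff = rng.integers(-5, 6, n)    # kill difference
# Toy label: blue wins more often when ahead, with logistic noise
logit = 0.0008 * gold_diff + 0.0005 * exp_diff + 0.3 * kill_diff
blue_wins = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([gold_diff, exp_diff, kill_diff])
X_tr, X_te, y_tr, y_te = train_test_split(X, blue_wins, random_state=0)

# Scaling before logistic regression keeps features on comparable ranges
clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))
print(f"test accuracy = {acc:.4f}")
```

The held-out accuracy here corresponds to the 0.7277 figure reported in the study; swapping in the real Kaggle features would follow the same `fit`/`predict`/`accuracy_score` steps.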
Category: Computer Science >> Natural Language Understanding and Machine Translation — Submitted: 2023-11-01
Abstract: Most research and applications on natural language still concentrate on its superficial features and structures. However, natural language is essentially a way of encoding information and knowledge; thus, the focus should be on what is encoded and how it is encoded. In line with this, we suggest a database-based approach to natural language processing that emulates the encoding of information and knowledge to build models. Based on these models, 1) generating sentences becomes akin to reading data from the models (or databases) and encoding it following certain rules; 2) understanding sentences involves decoding rules and a series of Boolean operations on the databases; 3) learning can be accomplished by writing to the databases. Our method closely mirrors how the human brain processes information, offering excellent interpretability and expandability.
Category: Computer Science >> Integration Theory of Computer Science — Submitted: 2022-11-28 — Partner journal: Data Intelligence
Abstract: Artificial intelligence and machine learning applications are of significant importance in almost every field of human life, whether solving problems directly or supporting human experts. However, choosing the machine learning model that will achieve a superior result for a particular problem, across the wide range of real-life application areas, remains a challenging task for researchers. A model's success can be affected by several factors, such as dataset characteristics, training strategy, and model responses. Therefore, a comprehensive analysis is required to determine model ability and the efficiency of the considered strategies. This study implemented ten benchmark machine learning models on seventeen varied datasets. Experiments were performed using four different training strategies: 60:40, 70:30, and 80:20 hold-out splits and five-fold cross-validation. We used three evaluation metrics: mean squared error, mean absolute error, and coefficient of determination (R2 score). The considered models are analyzed, and each model's advantages, disadvantages, and data dependencies are indicated. Across this large number of experiments, the deep Long Short-Term Memory (LSTM) neural network outperformed the other considered models, namely decision tree, linear regression, support vector regression with linear and radial basis function kernels, random forest, gradient boosting, extreme gradient boosting, shallow neural network, and deep neural network. It has also been shown that cross-validation has a tremendous impact on the results of the experiments and should be considered for model evaluation in regression studies where data mining or selection is not performed.
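The evaluation protocol this abstract describes — comparing a hold-out split against five-fold cross-validation using MSE, MAE, and R² — can be sketched in a few lines. The dataset, the random-forest model, and the 80:20 split chosen here are illustrative stand-ins, not the study's actual benchmark setup.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split, cross_validate
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

X, y = make_regression(n_samples=400, n_features=8, n_informative=5,
                       noise=10.0, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0)

# Strategy 1: an 80:20 hold-out split, scored with all three metrics
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
print("hold-out MSE/MAE/R^2:",
      mean_squared_error(y_te, pred),
      mean_absolute_error(y_te, pred),
      r2_score(y_te, pred))

# Strategy 2: five-fold cross-validation on the same metrics
scores = cross_validate(model, X, y, cv=5,
                        scoring=("neg_mean_squared_error",
                                 "neg_mean_absolute_error", "r2"))
print("5-fold mean R^2:", scores["test_r2"].mean())
```

Comparing the single-split R² with the cross-validated mean (and its spread across folds) illustrates the abstract's point that cross-validation can materially change the conclusions drawn about a model.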
Category: Computer Science >> Integration Theory of Computer Science — Submitted: 2022-11-28 — Partner journal: Data Intelligence
Abstract: Machine learning (ML) applications in weather and climate are gaining momentum as big data and the immense increase in high-performance computing (HPC) power pave the way. Ensuring FAIR data and reproducible ML practices are significant challenges for Earth system researchers. Even though the FAIR principles are well known to many scientists, research communities are slow to adopt them. The Canonical Workflow Framework for Research (CWFR) provides a platform to ensure the FAIRness and reproducibility of these practices without overwhelming researchers. This conceptual paper envisions a holistic CWFR approach towards ML applications in weather and climate, focusing on HPC and big data. Specifically, we discuss the FAIR Digital Object (FDO) and Research Object (RO) in the DeepRain project to achieve granular reproducibility. DeepRain is a project that aims to improve precipitation forecasting in Germany by using ML. Our concept envisages the raster datacube to provide data harmonization and fast, scalable data access. We suggest the Jupyter notebook as a single reproducible experiment. In addition, we envision JupyterHub as a scalable and distributed central platform that connects all these elements and the HPC resources to researchers via an easy-to-use graphical interface.
Category: Computer Science >> Integration Theory of Computer Science — Submitted: 2022-11-28 — Partner journal: Data Intelligence
Abstract: There is a huge gap between (1) the state of workflow technology and the practices in the many labs working with data-driven methods, and (2) the awareness of the FAIR principles and the lack of corresponding changes in practice over the last five years. The CWFR concept was defined to address both intentions: increasing the use of workflow technology and improving FAIR compliance. In the study described in this paper, we indicate how this could be applied to machine learning, which is now used by almost all research disciplines, with the well-known effect of a huge lack of repeatability and reproducibility. Researchers will only change practices if they can work efficiently and are not loaded with additional tasks. A comprehensive CWFR framework would be an umbrella for all steps that need to be carried out to do machine learning on selected data collections, immediately creating comprehensive and FAIR-compliant documentation. The researcher is guided by such a framework, and information once entered can easily be shared and reused. The many iterations normally required in machine learning can be dealt with efficiently using CWFR methods. Libraries of components that can be easily orchestrated, using FAIR Digital Objects as a common entity to document all actions and to exchange information between steps without the researcher needing to understand anything about PIDs and FDO details, are probably the way to increase efficiency in repeating research workflows. As the Galaxy project indicates, the availability of supporting tools will be important to let researchers use these methods. Unlike the Galaxy framework, however, it would be necessary to include all steps required for a machine learning task, including those that involve human interaction, and to document all phases with the help of structured FDOs.
Category: Computer Science >> Integration Theory of Computer Science — Submitted: 2022-11-27 — Partner journal: Data Intelligence
Abstract: Computational prediction of in-hospital mortality in the setting of an intensive care unit can help clinical practitioners guide care and make early decisions about interventions. As clinical data are complex and varied in their structure and components, continued innovation in modelling strategies is required to identify architectures that can best model outcomes. In this work, we trained a Heterogeneous Graph Model (HGM) on electronic health record (EHR) data and used the resulting embedding vector as additional information for a Convolutional Neural Network (CNN) model predicting in-hospital mortality. We show that the additional information provided by including time as a vector in the embedding captured the relationships between medical concepts, lab tests, and diagnoses, which enhanced predictive performance. We found that adding the HGM to a CNN model increased mortality prediction accuracy by up to 4%. This framework serves as a foundation for future experiments involving different EHR data types on important healthcare prediction tasks.
Category: Computer Science >> Integration Theory of Computer Science — Submitted: 2022-11-18 — Partner journal: Data Intelligence
Abstract: Data availability statements can provide useful information about how researchers actually share research data. We used unsupervised machine learning to analyze 124,000 data availability statements submitted by research authors to 176 Wiley journals between 2013 and 2019. We categorized the data availability statements and looked at trends over time. We found expected increases in the number of data availability statements submitted over time, and marked increases that correlate with policy changes made by journals. Our open-data challenge now is to use what we have learned to present researchers with relevant, easy options that help them share and make an impact with new research data.
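One common way to categorize free-text statements without labels, as the abstract above describes, is to cluster them. This sketch uses TF-IDF features and k-means; the example statements are invented and far smaller than the study's 124,000-statement corpus, and the abstract does not specify which unsupervised method the authors actually used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Invented examples of typical data availability statement phrasings
statements = [
    "Data are available from the corresponding author on reasonable request.",
    "All data are included in the article and its supplementary files.",
    "Data are deposited in the Dryad repository.",
    "The datasets are available upon request from the authors.",
    "Raw data are provided in the supplementary information.",
    "Sequence data are available in a public repository (GenBank).",
]

# Turn each statement into a sparse TF-IDF vector, then group similar ones
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(statements)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
for text, label in zip(statements, labels):
    print(label, text)
```

At scale, the resulting clusters (e.g. "on request", "in supplementary files", "in a repository") become the categories whose frequencies can then be tracked over time, as the study did.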