To help your teams become autonomous, we can upskill your data scientists in specific areas.
We can deliver in-house training in machine learning and deep learning, à la carte, according to your needs…
Experts and researchers working at the state of the art can also come to work within your organization.
Several formats are possible: short training courses lasting a few days, or coaching across a full programme, sometimes spread over several months at a few hours per week.
Introduction to Graph Mining
Mathematical modeling and random graph models
Spectral theory and spectral clustering algorithm
Greedy optimization of modularity
Generation of random graphs
Spectral decomposition and community detection
Loading larger real-world graphs
Applications to heterogeneous data analysis
NLP preprocessing: tokenization, lemmatization, part-of-speech tagging, stopwords
Text vectorization
Similarities and distances applicable to text
Clustering and Text Classification, Latent Dirichlet Allocation
Generative models and neural networks
One-hot encoder, tf-idf
Use of word embeddings (word2vec, doc2vec, fasttext, …)
Application to text classification and clustering
In this course, we introduce two paradigms of machine learning: statistical learning and online learning. Statistical learning is the conventional ML framework, in which a model or algorithm is fitted on a static set of observed data. It relies on the standard i.i.d. assumption, which shows its limits in some current applications. We therefore introduce a more realistic framework, online learning, in which observations arrive sequentially. We present algorithms capable of adapting to each new observation. We introduce standard algorithms, such as the exponentially weighted average forecaster and bandit algorithms, and highlight their practical performance across different scenarios.
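To make the sequential setting concrete, here is a minimal sketch of an exponentially weighted average forecaster in Python. It is an illustration only, not course material: the squared loss, the learning rate `eta` and the toy experts are all assumptions chosen for the example.

```python
import math

def exponentially_weighted_average(expert_predictions, outcomes, eta=2.0):
    """Combine expert predictions round by round.

    expert_predictions: one list of predictions (one per expert) per round.
    outcomes: the value revealed after each round.
    eta: learning rate (assumed value; it would be tuned in practice).
    """
    n_experts = len(expert_predictions[0])
    cumulative_loss = [0.0] * n_experts
    forecasts = []
    for preds, y in zip(expert_predictions, outcomes):
        # Weight each expert by exp(-eta * past loss); subtracting the
        # smallest loss first keeps the exponentials numerically stable.
        min_loss = min(cumulative_loss)
        weights = [math.exp(-eta * (loss - min_loss)) for loss in cumulative_loss]
        total = sum(weights)
        forecasts.append(sum(w * p for w, p in zip(weights, preds)) / total)
        # The outcome is revealed only after the forecast is made.
        for i, p in enumerate(preds):
            cumulative_loss[i] += (p - y) ** 2  # squared loss (assumption)
    return forecasts, cumulative_loss

# Toy data: expert 0 always predicts 0.0, expert 1 always predicts 1.0,
# and the true outcome is always 1.0.
rounds = 10
forecasts, losses = exponentially_weighted_average(
    [[0.0, 1.0]] * rounds, [1.0] * rounds
)
```

The first forecast is the plain average of the experts. As losses accumulate, the weight shifts toward the expert that has been right so far.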
Recommender systems became widely known through the Netflix challenge. The challenge focused on exploring approaches to deliver accurate personalized content by predicting users’ ratings of movies. All the information needed came from users’ previous movie ratings. Recommender systems are also used for information search and content discovery: combined with querying and browsing, they help users faced with a huge amount of information to navigate it efficiently and satisfactorily. Finally, the need for real-time decision making, combined with the desire to offer accurate but diverse content, has driven the evolution of traditional recommendation approaches such as collaborative filtering, from factorization machines to bandit models. This evolution of recommender systems towards bandit models is at the heart of this course. We focus on explaining the main statistical approaches in both theoretical and algorithmic terms. Factorization machines and bandit models are explained and tested on real data sets.
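As an illustration of the bandit models mentioned above, here is a minimal sketch of the UCB1 algorithm in Python. It is one standard bandit strategy, not necessarily the one used in the course. The `pull` callback, the `click_rates` values and the horizon are hypothetical, chosen only for the example.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds.

    pull(arm) must return a reward in [0, 1] for the chosen arm.
    Returns the number of pulls and the total reward per arm.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initialisation: play every arm once
        else:
            # Pick the arm with the highest upper confidence bound:
            # empirical mean plus an exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums

# Hypothetical example: three items with unknown click probabilities.
rng = random.Random(0)
click_rates = [0.2, 0.5, 0.8]  # assumed values, hidden from the algorithm
counts, sums = ucb1(
    lambda arm: 1.0 if rng.random() < click_rates[arm] else 0.0,
    n_arms=3,
    horizon=2000,
)
```

The exploration bonus shrinks as an arm is pulled more often. Recommendations therefore concentrate on the items that look best, while the others are still sampled occasionally.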
This training offers an overview of traditional methods, recent models and algorithms for analyzing textual data. You will discover the main challenges and techniques of NLP, word vectorization and seq2seq. You will practice these concepts in hands-on exercises using the Python language. A complete session is dedicated to building a chatbot that draws on all the concepts learned during the NLP training.
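As a small taste of text vectorization, here is a sketch of tf-idf weighting with cosine similarity, written in plain Python. The whitespace tokenizer, the smoothed idf formula `log(N / df) + 1` and the sample sentences are assumptions made for the example. Real exercises would more likely rely on an NLP library.

```python
import math
from collections import Counter

def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace (assumption).
    return text.lower().split()

def tfidf_vectors(documents):
    """Return one sparse tf-idf vector (dict term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # One common idf variant; other smoothing choices exist.
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * idf[t] for t, c in term_counts.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Made-up sample documents.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "bandit algorithms balance exploration and exploitation",
]
vectors = tfidf_vectors(docs)
similar = cosine(vectors[0], vectors[1])
unrelated = cosine(vectors[0], vectors[2])
```

The first two sentences share several terms and therefore get a positive similarity. The third shares none with the first, so their similarity is zero.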
This training provides the foundations for understanding deep learning and how it is implemented. A hands-on part lets you apply these concepts by working through cases based on real data sets.
This course provides an overview of traditional and newer methods, models and algorithms for the analysis of large graphs. You will discover the main challenges and techniques of graph mining, random graph models, and the theoretical and algorithmic foundations of graph analysis and community detection. You will also practice these concepts in hands-on exercises using the Python language.
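To give an idea of what community detection looks like in code, here is a minimal sketch of greedy modularity optimization in plain Python. It is a naive agglomerative variant that recomputes the modularity for every candidate merge, so it only suits tiny graphs. The toy graph is made up for the example, and dedicated graph libraries would be the practical choice.

```python
from itertools import combinations

def modularity(edges, communities):
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    internal = [0] * len(communities)  # edges inside each community
    degree = [0] * len(communities)    # sum of node degrees per community
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(e / m - (d / (2 * m)) ** 2 for e, d in zip(internal, degree))

def greedy_modularity(edges):
    """Start from singletons; repeatedly apply the merge that most improves Q."""
    nodes = sorted({node for edge in edges for node in edge})
    communities = [frozenset([node]) for node in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        best_merge = None
        for a, b in combinations(communities, 2):
            candidate = [c for c in communities if c not in (a, b)] + [a | b]
            q = modularity(edges, candidate)
            if q > best_q + 1e-12:  # keep strictly improving merges only
                best_q, best_merge = q, candidate
        if best_merge is None:
            break  # no merge improves modularity any further
        communities = best_merge
    return communities, best_q

# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
communities, q = greedy_modularity(edges)
```

On this toy graph the procedure stops once merging further would lower the modularity, which leaves the two triangles as separate communities.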