title: Hugging Face in practice: A complete guide to Transformers library, Pipeline and pre-trained models | Daoman PythonAI
description: Gain an in-depth understanding of the Hugging Face ecosystem, including how to use the Transformers library, Datasets library, Tokenizers library, and Pipeline. Covers complete practical content such as Chinese pre-training models, model fine-tuning, and data processing.
keywords: [Hugging Face, Transformers, Pipeline, pre-trained model, fine-tuning, Datasets, Tokenizers, NLP, machine learning, deep learning]
Hugging Face in action: A complete guide to Transformers library, Pipeline and pre-trained models
Yesterday the AI product manager told me, "Add sentiment classification for comments; it launches tomorrow." If you write a Transformer model from scratch, you will never make that deadline, even if you tear your hair out. But with Hugging Face? You can have a prototype in 10 minutes!
Today's hands-on walkthrough covers the core workflow, from calling a Pipeline in one line to putting Chinese pre-trained models to work. Mirror sources, tokenization, and fine-tuning pitfalls for users in mainland China are all included 👇
Hugging Face Ecosystem Quick Start
Hugging Face is no longer just a tool for "doing NLP", it is more like a "Swiss Army Knife Platform" in the AI era. From finding models and data to training, inference, and even online deployment, it covers almost the entire process of modern machine learning development. For those who are just getting started, just remember these four core components:
Installation and configuration (must-read for users in China)
Installing the entire toolset is very simple and can be done with just one line using pip. But in order to make the download faster, it is strongly recommended to add the Tsinghua mirror source:
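As a minimal sketch, the install command can be assembled like this. The package list is an assumption (you will also need a backend such as `torch`), and the index URL is the public Tsinghua PyPI mirror. The snippet only builds and prints the command, so nothing is installed until you run it yourself.

```python
import sys

# Public Tsinghua PyPI mirror; the package list below is an assumption.
TSINGHUA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"
PACKAGES = ["transformers", "datasets", "tokenizers"]


def build_install_command(packages, index_url=TSINGHUA_INDEX):
    """Return the pip command (as an argument list) that installs from a mirror."""
    return [sys.executable, "-m", "pip", "install", *packages, "-i", index_url]


command = build_install_command(PACKAGES)
print(" ".join(command))
```

To perform the install, copy the printed line into a terminal or pass `command` to `subprocess.run(command, check=True)`.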
After installation, run this small script to verify that the environment is ready:
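A verification script along these lines would do the job. This is a sketch, not the original script: it uses the standard library's `importlib.metadata` to look up installed versions, and the list of packages checked is an assumption.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["transformers", "datasets", "tokenizers"])
for name, version in report.items():
    status = f"✅ {version}" if version else "❌ not installed"
    print(f"{name}: {status}")
```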
If the script prints the emoji and text without errors, your Hugging Face journey has officially begun.
Pipeline 10-minute prototype development
Pipeline is the most user-friendly entry point in the entire ecosystem - it bundles the tedious steps of tokenization, model loading, inference, and output parsing, so you can complete a task with one line of code. When the product manager wants it "specified today, shipped tomorrow", Pipeline is a lifesaver.
Commonly used Chinese task demonstrations
The following demonstrates several of the most frequently requested tasks. All of them use pre-trained models optimized for Chinese to ensure reliable results.
1. Sentiment classification of Chinese comments
Load a model that has already been fine-tuned on Chinese news comment sentiment. For this task it is usually far more accurate than a general-purpose model used as-is:
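Here is a minimal sketch of that call. The model name is an assumption: `uer/roberta-base-finetuned-jd-binary-chinese` is a stand-in Chinese sentiment model from the Hub (trained on product reviews), so swap in whichever model fits your data. `transformers` is imported inside the loader so the helper can be reused without loading the library up front.

```python
def load_sentiment_pipeline(model_name="uer/roberta-base-finetuned-jd-binary-chinese"):
    """Build a text-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # heavy dependency, imported lazily

    return pipeline("text-classification", model=model_name)


def classify(classifier, comments):
    """Pair each comment with the predicted label and confidence score."""
    results = classifier(comments)
    return [(text, r["label"], float(r["score"])) for text, r in zip(comments, results)]


def main():
    classifier = load_sentiment_pipeline()
    comments = ["物流很快，质量也不错", "用了两天就坏了，太失望了"]
    for text, label, score in classify(classifier, comments):
        print(f"{label}  {score:.4f}  {text}")
```

Calling `main()` needs network access the first time, because the model weights are fetched from the Hub.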
The pipeline returns a positive or negative label along with a confidence score, so you barely need to think about the model internals.
2. Chinese Named Entity Recognition (NER)
Need to extract names of people, places, and organizations from a large block of text? One Pipeline handles it. With the `grouped_entities` parameter enabled (recent Transformers releases use `aggregation_strategy` instead), consecutive subwords are merged automatically. For example, "Chaoyang District, Beijing" is recognized as one complete location.
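A sketch of the NER call might look like this. The model name `uer/roberta-base-finetuned-cluener2020-chinese` is an assumption (any Chinese token-classification model from the Hub should work), and the example uses `aggregation_strategy="simple"`, the current way to request merged entities. The surface text is sliced from the input by character offsets, because the merged `word` field of Chinese BERT-style tokenizers tends to contain spaces between characters.

```python
def load_ner_pipeline(model_name="uer/roberta-base-finetuned-cluener2020-chinese"):
    """Build a token-classification pipeline that merges subwords into entities."""
    from transformers import pipeline  # heavy dependency, imported lazily

    return pipeline("token-classification", model=model_name, aggregation_strategy="simple")


def extract_entities(ner, text):
    """Return (entity_type, surface_text) pairs for one input string."""
    entities = []
    for item in ner(text):
        if item.get("start") is not None and item.get("end") is not None:
            surface = text[item["start"]:item["end"]]  # slice by character offsets
        else:
            surface = item["word"].replace(" ", "")  # fallback when offsets are missing
        entities.append((item["entity_group"], surface))
    return entities


def main():
    ner = load_ner_pipeline()
    print(extract_entities(ner, "张三在北京市朝阳区的一家科技公司工作"))
```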
The entity recognition results are clear at a glance, which makes this well suited to quickly building information extraction features.
Transformers core components revealed
Pipeline is convenient, but if you need finer-grained control - such as handling the model's input format yourself, or extracting feature vectors from an intermediate layer - you need to work with the three core components of the Transformers library. They are the cornerstone of the entire library, and they all have "Auto" in their names, which means they automatically pick the correct implementation based on the model name.
Overview of the three core components
- AutoTokenizer: Automatically loads the tokenizer matched with the model, responsible for converting text into a sequence of numbers.
- AutoModelForXxx: Automatically loads a model with a task-specific head. For example, `AutoModelForSequenceClassification` is a model with a classification head.
- AutoConfig: Reads or modifies the model configuration, such as the hidden layer size or the number of classes.
General usage examples
Take `chinese-roberta-wwm-ext`, released jointly by Harbin Institute of Technology and iFlytek, as an example. It is an excellent Chinese pre-trained model, suitable for most Chinese understanding tasks. The following code demonstrates how to load the tokenizer and model, and then extract the semantic features of the text:
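The loading and feature-extraction steps can be sketched as follows. Two assumptions: the Hub id `hfl/chinese-roberta-wwm-ext`, and mean pooling over non-padding tokens as the way to turn token vectors into one sentence vector (taking the `[CLS]` vector is another common choice). The small `cosine_similarity` helper is a hypothetical addition to show one downstream use.

```python
import math

MODEL_NAME = "hfl/chinese-roberta-wwm-ext"  # assumed Hub id for chinese-roberta-wwm-ext


def embed_texts(texts, model_name=MODEL_NAME):
    """Return one mean-pooled sentence vector (list of floats) per input text."""
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, hidden_size)

    # Average the token vectors, ignoring padding positions.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.tolist()


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def main():
    first, second = embed_texts(["今天天气真好", "今天阳光明媚"])
    print(len(first), round(cosine_similarity(first, second), 4))
```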
With these lines of code, you get a deep semantic representation of each text, which can be used in downstream tasks such as vector retrieval, clustering, and semantic similarity calculation.
Chinese pre-trained models in practice
There is an iron rule in Chinese NLP: never run Chinese data directly through an English pre-trained model. The vocabulary of an English model is completely different from what Chinese text needs, so forcing it through splits most of the text into single characters or maps it to unknown tokens, and the results suffer accordingly. Fortunately, many models have been carefully trained on Chinese corpora. The following table summarizes the most commonly used and best-regarded models on the Hugging Face Hub today:
Chinese e-commerce review classification in practice
Suppose you need to process a batch of e-commerce user reviews. You can add a simple classification head on top of `chinese-roberta-wwm-ext` to build a binary classification model (such as positive/negative). This takes advantage of the pre-trained model's strong semantic understanding while adapting it to your own classification task:
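A minimal sketch of that setup, assuming the Hub id `hfl/chinese-roberta-wwm-ext` and hypothetical label names `negative`/`positive`. Note that the classification head created here is randomly initialized, so predictions are not meaningful until the model has been fine-tuned.

```python
# Hypothetical label scheme for a binary review classifier.
ID2LABEL = {0: "negative", 1: "positive"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def build_classifier(model_name="hfl/chinese-roberta-wwm-ext"):
    """Load the tokenizer and the encoder with a fresh 2-class classification head."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return tokenizer, model


def main():
    tokenizer, model = build_classifier()
    inputs = tokenizer("这件衣服质量很好", return_tensors="pt")
    print(model(**inputs).logits.shape)  # one row of logits, one column per label
```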
The only step left is fine-tuning. The next section will provide a minimalist fine-tuning framework for you to use directly.
Fine-tuning: minimalist principles and a quick framework
The essence of fine-tuning is to take a general model that knows a little about everything and use your own data to turn it into an expert in a specific vertical domain. Before you start, remember the following three principles, which can help you avoid 80% of the pitfalls:
✅ 3 principles to follow before fine-tuning
- Data requirements: Prepare at least 100 high-quality labeled examples. If you have fewer, try prompt engineering or few-shot learning first.
- Parameter tuning: Start with a small batch size (e.g. `per_device_train_batch_size=8`) and a small learning rate (2e-5 to 5e-5 is a safe range).
- Validation set: Always hold out part of the data as a validation set, and monitor validation performance during training to keep the model from "memorizing" the training data (overfitting).
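The three principles can be turned into a small pre-flight check. This is a sketch using only the standard library: the 100-example threshold and the starting hyperparameters come from the list above, while the 20% validation ratio, the seed, and the function name are assumptions.

```python
import random

MIN_EXAMPLES = 100  # threshold from the first principle
TRAINING_DEFAULTS = {"per_device_train_batch_size": 8, "learning_rate": 2e-5}


def split_train_validation(examples, validation_ratio=0.2, seed=42):
    """Shuffle the examples and hold out a validation slice."""
    if len(examples) < MIN_EXAMPLES:
        raise ValueError(
            f"only {len(examples)} examples; consider prompt engineering or few-shot learning"
        )
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_validation = max(1, int(len(shuffled) * validation_ratio))
    return shuffled[n_validation:], shuffled[:n_validation]


examples = [(f"review {i}", i % 2) for i in range(120)]
train, validation = split_train_validation(examples)
print(len(train), len(validation), TRAINING_DEFAULTS)
```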
🚀 Quick framework (just replace your data)
The code below is a complete but extremely streamlined fine-tuning workflow. You only need to replace `your_data` with your own CSV or dictionary; the rest can basically be copied as-is:
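One way such a framework could look is sketched below, using `Trainer` from Transformers and `Dataset` from the Datasets library. Assumptions: `your_data` is a dictionary with `text` and `label` lists, the base model is `hfl/chinese-roberta-wwm-ext`, the output directory is `./my_model`, and the epoch count and maximum length are placeholder values. Validation here runs once after training; per-epoch evaluation is controlled by a `TrainingArguments` option whose name differs between library versions, so it is left out.

```python
def check_data(your_data):
    """Validate the {"text": [...], "label": [...]} layout and return the example count."""
    texts, labels = your_data["text"], your_data["label"]
    if len(texts) != len(labels):
        raise ValueError("text and label must have the same length")
    return len(texts)


def fine_tune(your_data, model_name="hfl/chinese-roberta-wwm-ext", output_dir="./my_model"):
    """Fine-tune a binary classifier and save it to output_dir."""
    from datasets import Dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        DataCollatorWithPadding,
        Trainer,
        TrainingArguments,
    )

    check_data(your_data)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    # Hold out 20% of the data as a validation set.
    splits = Dataset.from_dict(your_data).train_test_split(test_size=0.2, seed=42)
    tokenized = splits.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
        batched=True,
    )

    args = TrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=8,  # small batch, per the principles above
        learning_rate=2e-5,             # within the 2e-5 to 5e-5 range
        num_train_epochs=3,             # placeholder value
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=tokenized["train"],
        eval_dataset=tokenized["test"],
        data_collator=DataCollatorWithPadding(tokenizer),  # pads each batch dynamically
    )
    trainer.train()
    metrics = trainer.evaluate()           # loss on the validation split
    trainer.save_model(output_dir)         # weights and config
    tokenizer.save_pretrained(output_dir)  # tokenizer files, needed for inference
    return metrics
```

For a CSV file, load it into the same dictionary layout first (for example with the standard `csv` module) and pass the result as `your_data`.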
The framework is easy to follow, and with comments added, you can get started quickly even if you are not familiar with Hugging Face. Once training finishes, model files that can be deployed directly are written to the directory you specify.
Model deployment and inference optimization
A trained model should not go straight to production as raw PyTorch weights: inference will be slow, memory use high, and the user experience poor. A few lightweight optimizations make the model fast and stable in production.
🎯 Local rapid deployment (using Pipeline)
Saved models can be loaded directly with Pipeline, just like using the finished product out of the box:
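A minimal sketch of loading a saved model, assuming it was written (together with its tokenizer) to a hypothetical `./my_model` directory. The `predict` helper is an illustrative wrapper that returns a plain dictionary, which is convenient if the result later goes out as JSON.

```python
def load_local_classifier(model_dir="./my_model"):
    """Load a fine-tuned model and its tokenizer from a local directory."""
    from transformers import pipeline  # heavy dependency, imported lazily

    return pipeline("text-classification", model=model_dir)


def predict(classifier, text):
    """Classify one text and return a JSON-friendly dictionary."""
    result = classifier(text)[0]  # a single input yields a one-element list
    return {"text": text, "label": result["label"], "score": float(result["score"])}


def main():
    classifier = load_local_classifier()
    print(predict(classifier, "包装破损，客服态度也很差"))
```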
One line of code loads the model; wrap the call in a web framework and you can expose it as an API, which is very convenient.
✨ 3 commonly used inference optimizations
Depending on your deployment environment, you can choose among three optimization strategies: FP16 half precision, INT8 quantization, and ONNX export.
These optimization methods have detailed code examples in the official documentation, and the conversion usually takes only a few lines of code. Don't skip this step before going live.
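As a rough sketch of what those few lines can look like: FP16 through `model.half()`, INT8 through PyTorch dynamic quantization of the linear layers, and ONNX export through the separate `optimum` package. Treat these as assumptions about one reasonable setup, not the only route; the function names are hypothetical, and the ONNX helper assumes `optimum[onnxruntime]` is installed.

```python
def to_fp16(model):
    """Half precision: roughly halves memory use; intended for GPU inference."""
    return model.half().eval()


def to_int8(model):
    """Dynamic INT8 quantization of Linear layers; intended for CPU inference."""
    import torch

    return torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)


def export_onnx(model_dir, onnx_dir):
    """Convert a saved classification model to ONNX and write it to onnx_dir."""
    from optimum.onnxruntime import ORTModelForSequenceClassification

    ort_model = ORTModelForSequenceClassification.from_pretrained(model_dir, export=True)
    ort_model.save_pretrained(onnx_dir)
    return ort_model
```

Whichever option you pick, compare predictions on a held-out sample before and after conversion, since lower precision can shift results slightly.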
Summary
Hugging Face has lowered the barrier to natural language processing from "master's thesis level" to "you can get started if you know Python". It boils down to four points:
- **Want to quickly test an idea?** Use Pipeline and have a prototype in 10 minutes. The product manager will be surprised.
- **Working on Chinese tasks?** Steer clear of English models and search the Hub for pre-trained models with "chinese" in the name.
- **Need a dedicated model?** If you have enough data, fine-tune. The three principles and the quick framework above make it straightforward.
- **Ready to go live?** Don't deploy an unoptimized model; do a round of optimization with FP16, INT8, or ONNX first.
With this toolbox, next time you encounter the need to "go online tomorrow", you can calmly reply: "Give me a cup of coffee."

