Why customise small language models
Large language models are trained on broad, general-purpose data, so they often lack the domain-specific knowledge, tone, and reasoning patterns that specialised fields such as healthcare, law, and finance require. Small language models (SLMs) fine-tuned on domain-specific data can match or even surpass much larger general-purpose models on the target task, at a fraction of the inference cost.
Customising SLMs is therefore an increasingly attractive strategy for organisations seeking to balance performance with cost efficiency, latency requirements, and data-privacy concerns.
What we do
humaineeti provides deep capabilities across both the training and the serving side of model customisation.
Post-training customisation
- Instruction tuning
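- Alignment: reinforcement learning from human feedback (RLHF), direct preference optimization (DPO), group relative policy optimization (GRPO)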
- Task-specific fine-tuning
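To make one of the alignment methods listed above concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is an illustration under stated assumptions, not a description of humaineeti's training code: the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are hypothetical, and the inputs are assumed to be summed sequence log-probabilities from the model being trained and from a frozen reference model.

```python
import math


def _softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    All names here are hypothetical. Each argument is the summed
    log-probability of a full response under the policy or the
    frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits).
    return _softplus(-logits)
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model. When the policy and the reference agree exactly, the loss equals log 2.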
Inference optimization
- Quantization
- Speculative decoding
- Distillation
- Model parallelism
- Kernel optimization
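As an illustration of the first item in the list above, the following sketch shows symmetric per-tensor int8 quantization on a plain list of weights. It is a simplified teaching example under assumptions: the helper names `quantize_int8` and `dequantize_int8` are hypothetical, the sample weights are made up, and production toolchains typically use per-channel or per-group scales and calibration data.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Made-up example weights, for illustration only.
weights = [0.12, -0.5, 0.33, 0.0, 0.49]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored in one byte instead of four, and the round-trip error per weight is at most half the scale. That trade of a small, bounded error for lower memory use and bandwidth is the core of the technique.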
How we deliver: an 8-week model optimization project
We run model optimization engagements over eight weeks, in the following steps:
1. Discovery & scoping
Define business objectives, success metrics, and performance benchmarks.
2. Data preparation
Curate, clean, and label domain-specific data.
3. Model selection & baselining
Select candidate base models and establish baseline performance metrics before customisation begins.
4. Iterative training & optimization
Run training and optimization cycles, measuring against the benchmarks at each iteration.
5. Evaluation & SME review
Run metric-based evaluation, followed by review by subject-matter experts.
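The baselining and per-iteration measurement described in the steps above can be sketched as follows. This is a hypothetical illustration, not the tooling used on engagements: the function names, the choice of exact-match accuracy as the metric, and the report fields are all assumptions, and a real project would use whatever success metrics are agreed during discovery.

```python
def exact_match_accuracy(predictions, references):
    """Share of predictions equal to the reference, ignoring case and outer whitespace."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("need at least one example")
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


def iteration_report(baseline_score, target_score, iteration_scores):
    """Compare each training iteration with the baseline and the agreed target."""
    report = []
    for i, score in enumerate(iteration_scores, start=1):
        report.append({
            "iteration": i,
            "score": score,
            "gain_over_baseline": score - baseline_score,
            "meets_target": score >= target_score,
        })
    return report
```

For example, `iteration_report(0.5, 0.8, [0.6, 0.85])` would flag the second iteration as the first to meet the target. Scoring every iteration against the same fixed baseline and target keeps the comparison consistent across training cycles.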
Where it fits
AI Research & Customisation is the Research & Customisation tier of our engagement model. Customised models are evaluated by AI Eval Service, governed by Responsible AI controls, and shipped to production through the GenAI Delivery Factory. Pair it with RouteIQ to route the right queries to your customised model at the right cost.