Blog

Domine a plataforma e otimize seus gastos!

Customizando modelos do Azure Open AI com fine-tuning

Fine-tuning é o processo de ajustar modelos de linguagem para atender necessidades específicas, aprimorando seu desempenho em tarefas direcionadas por meio do treinamento adicional com dados personalizados.

No Azure, fine-tuning com modelos da OpenAI é facilitado por uma plataforma robusta que permite personalizar modelos como GPT para atender às necessidades de diferentes cenários empresariais. Mais detalhes podem ser encontrados na documentação oficial do Azure.

Entre as vantagens do fine-tuning estão: resultados de maior qualidade em comparação com engenharia de prompts, capacidade de treinar com mais exemplos do que o limite máximo de contexto de um modelo, economia de tokens devido a prompts mais curtos e respostas com menor latência, especialmente ao usar modelos menores.

O post de hoje será sobre um app que utilizar um processo de chat completion que irá receber informações sobre currículos de pessoas, irá extrair os principais dados e apresentará os resultados em um formato específico. Hoje utilizaremos o modelo 40-mini do Azure OpenAI.

Aqui temos um exemplo de input:

John Doe, a 28-year-old Frontend Developer with experience at TechCorp and WebSolutions, specializes in delivering responsive, high-performing interfaces using HTML, CSS, and JavaScript. Fluent in both English and Spanish, he excels at collaborating with diverse teams to create clean, maintainable code leveraging frameworks like React and Angular. John’s hard skills include version control (Git), RESTful API integration, and agile methodologies, while his soft skills—such as problem-solving, adaptability, and strong communication—enable him to consistently align development efforts with business objectives and user needs.

O modelo deverá extrair as seguintes informações: resumo do perfil, hard skills, soft skills, empregos e idiomas.

Com isso, o output deverá ser formatado na seguinte forma:

Summary: <short summary about the person>. Hard skills: <hard skills>. Soft skills: <soft skills>. Languages: <languages>. Jobs: <Professional jobs>.

Para o exemplo acima, temos o output:

Summary: John Doe is a 28-year-old Frontend Developer recognized for creating responsive, high-performing user interfaces. Hard skills: HTML, CSS, JavaScript, React, Angular, Git, RESTful API integration, agile methodologies. Soft skills: Problem-solving, adaptability, strong communication. Languages: English, Spanish. Jobs: TechCorp, WebSolutions.

Para iniciar a implementação, vamos criar a pasta raiz para o projeto:

mkdir AzureOpenAiFineTuning

Iremos criar um projeto do tipo console e adicionar o pacote NuGet:

dotnet new console
dotnet add package Azure.AI.OpenAI

Vamos começar definindo um prompt para ser o System message:

const string systemMessage = """
    "You are a helpful RH assistant that create  Input summarizations.
    Your task is to summarize the person's  Input using the follow format:

    'Summary: <short summary about the person>. Hard skills: <hard skills>. Soft skills: <soft skills>. Languages: <languages>. Jobs: <Professional jobs>.'

    ------
    Note: Some of the inputs may not include all information, such as languages or professional jobs. In Summary, you must put 'None' in the value'.
    Note: Outputs only the summarized information. Do not include the input text in the output.
    ---------

    Example 1.
    Input -> John Doe, a 28-year-old Frontend Developer with experience at TechCorp and WebSolutions, specializes in delivering responsive, high-performing interfaces using HTML, CSS, and JavaScript. Fluent in both English and Spanish, he excels at collaborating with diverse teams to create clean, maintainable code leveraging frameworks like React and Angular. John’s hard skills include version control (Git), RESTful API integration, and agile methodologies, while his soft skills—such as problem-solving, adaptability, and strong communication—enable him to consistently align development efforts with business objectives and user needs.
    Output ->  Summary: John Doe is a 28-year-old Frontend Developer recognized for creating responsive, high-performing user interfaces. Hard skills: HTML, CSS, JavaScript, React, Angular, Git, RESTful API integration, agile methodologies. Soft skills: Problem-solving, adaptability, strong communication. Languages: English, Spanish. Jobs: TechCorp, WebSolutions.

    Example 2.
    Input -> Talles Silva, a 30-year-old Backend Developer with experience at Skyline Systems and CodeForge Studios, specializes in building scalable APIs, optimizing database performance, and implementing cloud-based solutions. Proficient in C#, .NET Core, PostgreSQL, and Azure, he delivers efficient, maintainable code while collaborating effectively with diverse teams. Talles excels in problem-solving, leadership, and aligning technical solutions with business goals.
    Output -> Summary: Talles Silva is a 30-year-old Backend Developer recognized for building scalable APIs, optimizing database performance, and implementing cloud-based solutions. Hard skills: C#, .NET Core, PostgreSQL, Azure. Soft skills: Problem-solving, leadership, effective collaboration, alignment with business goals. Languages: None. Jobs: Skyline Systems, CodeForge Studios.

    Example 3.
    Input -> Lucas Ferreira, a 35-year-old Database Administrator with extensive experience at DataSphere Innovations and Nexus Analytics, excels in managing and optimizing database systems for high-performance applications. With deep expertise in PostgreSQL, SQL Server, and MySQL, Lucas has implemented advanced indexing strategies, optimized complex queries, and ensured database scalability to handle enterprise workloads. He is also skilled in managing high-availability clusters, performing seamless data migrations, and automating routine database tasks using scripts and monitoring tools. Fluent in Portuguese, English, and Italian, Lucas works effectively with diverse teams to deliver reliable, secure, and well-documented database environments. His experience extends to integrating cloud databases with platforms like Azure and AWS, leveraging tools such as Power BI for actionable analytics, and ensuring compliance with data governance standards. Known for his problem-solving abilities and proactive approach, Lucas consistently aligns database solutions with organizational objectives, enhancing operational efficiency and data reliability.
    Output ->  Summary: Lucas Ferreira is a 35-year-old Database Administrator recognized for managing and optimizing database systems to ensure high performance and reliability. Hard skills: PostgreSQL, SQL Server, MySQL, advanced indexing strategies, complex queries optimization, high-availability clusters management, data migrations, scripting, monitoring tools, Azure, AWS, Power BI, data governance compliance. Soft skills: Problem-solving, proactive approach, effective collaboration. Languages: Portuguese, English, Italian. Jobs: DataSphere Innovations, Nexus Analytics.

    Example 4.
    Input -> Alex Morgan, a 29-year-old Data Scientist with experience at DataPulse Analytics and NextGen Insights, is adept at transforming complex data into actionable insights. Specializing in Python, R, machine learning algorithms, and data visualization, Alex has successfully developed predictive models that increased operational efficiency by 30% and improved customer satisfaction through targeted recommendations. Fluent in English and French, Alex collaborates seamlessly with cross-functional teams to deliver solutions aligned with strategic business objectives. Known for analytical thinking, curiosity, and strong communication skills, Alex drives data-driven decision-making while consistently optimizing processes for scalability.
    Output -> Alex Morgan is a 29-year-old Data Scientist known for developing predictive models and actionable insights. Hard skills: Python, R, machine learning, data visualization. Soft skills: Analytical thinking, curiosity, strong communication. Languages: English, French. Jobs: DataPulse Analytics, NextGen Insights.

    Example 5.
    Input -> Emily Carter, a 41-year-old Project Manager recognized for overseeing large-scale e-commerce implementations, excels in budget management, stakeholder engagement, and risk assessment. With expertise in Agile methodologies, Scrum framework, and software project lifecycle, Emily efficiently leads cross-functional teams to meet deadlines and ensure project success. She is highly effective at resolving conflicts, prioritizing tasks, and aligning project goals with business objectives. Emily’s proactive communication and problem-solving skills result in streamlined processes and improved client satisfaction.
    Output -> Emily Carter is a 41-year-old Project Manager known for successfully leading e-commerce implementations. Hard skills: Agile methodologies, Scrum, software project lifecycle, budget management, risk assessment. Soft skills: Conflict resolution, task prioritization, proactive communication, problem-solving. Languages: None. Jobs: None.
    """;

No prompt acima, utilizamos as seguintes estratégias de prompt engineering:

Definição de persona: ao instruir o modelo para agir como um “RH assistant”, deixamos claro o contexto e o papel que ele deve desempenhar.
Instruções claras e específicas: há a solicitação para resumir o conteúdo e entregá-lo em um formato pré-definido, garantindo consistência no resultado.
Tratamento de informações ausentes: o prompt explicita que, caso alguma informação não exista (como idiomas ou experiências profissionais), deve-se usar “None”.
Few-Shot Prompting: exemplos de entrada e saída demonstram como o texto deve ser sintetizado, orientando o modelo a seguir o mesmo padrão.
Separação entre Input e Output: o prompt instrui a não incluir o texto original na resposta, mantendo foco somente no resumo.
Estilo conciso e direto: as instruções são objetivas, o que facilita a compreensão e a execução da tarefa pelo modelo.

Podemos criar um novo input para ser processado pelo modelo:

var resume = """
             Ethan Reynolds is a 29-year-old Full Stack Developer with over five years of experience in designing, building, and maintaining high-traffic web applications using Node.js, React, and PostgreSQL. He has driven key projects at ByteLeap and CloudGrid, emphasizing clean coding practices, agile collaboration, and user-focused solutions that scale efficiently under peak demands. Known for his strong communication skills, Ethan excels at translating complex technical requirements into clear, actionable strategies while working seamlessly with cross-functional teams. Committed to continuous learning and fluent in English, he stays at the forefront of emerging technologies and industry trends, ensuring that the applications he develops consistently meet evolving business and user needs.
             """;

Com o system message e o novo input prontos, podemos implementar o código que irá utilizar o chat completions para criar o output esperado.

As credenciais estão sendo utilizadas via variáveis de ambiente. Use este link para criar e obter as credenciais do seu modelo (No post de hoje estou utilizando o modelo 40-mini).

string key = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")!;
string model = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_MODEL")!;
string url = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_URL")!;

AzureOpenAIClient azureClient = new(
    new Uri(url),
    new ApiKeyCredential(key));

ChatClient chatClient = azureClient.GetChatClient(model);

ChatCompletion completion = chatClient.CompleteChat(
[
    new SystemChatMessage(systemMessage),
    new UserChatMessage(resume)
]);

Console.WriteLine($"Output: {completion.Content[0].Text}");
Console.WriteLine($"Total of tokens: {completion.Usage.TotalTokenCount}");

Vamos executar o projeto:

dotnet run

Output: Summary: Ethan Reynolds is a 29-year-old Full Stack Developer recognized for designing and maintaining high-traffic web applications. Hard skills: Node.js, React, PostgreSQL, clean coding practices, agile collaboration. Soft skills: Strong communication, translating technical requirements into strategies, continuous learning. Languages: English. Jobs: ByteLeap, CloudGrid.

Total of tokens: 1325

O resultado foi bem satisfatório. Utilizamos 1325 tokens (input + output) nesse processo!

Vamos ao processo de fine-tuning. Inicialmente, precisaremos de dados para treinamento do nosso modelo (baseado no 40-mini). Os dados devem estar no formato .jsonl, e cada linha deve seguir o padrão do exemplo abaixo:

{"messages": [{"role": "system", "content": "You are a RH specialist. Your task is to receiver a resume and outputs its summarization."}, {"role": "user", "content": "John Doe, a 28-year-old Frontend Developer with experience at TechCorp and WebSolutions, specializes in delivering responsive, high-performing interfaces using HTML, CSS, and JavaScript. Fluent in both English and Spanish, he excels at collaborating with diverse teams to create clean, maintainable code leveraging frameworks like React and Angular. John’s hard skills include version control (Git), RESTful API integration, and agile methodologies, while his soft skills—such as problem-solving, adaptability, and strong communication—enable him to consistently align development efforts with business objectives and user needs."}, {"role": "assistant", "content": "John Doe is a 28-year-old Frontend Developer recognized for creating responsive, high-performing user interfaces. Hard skills: HTML, CSS, JavaScript, React, Angular, Git, RESTful API integration, agile methodologies. Soft skills: Problem-solving, adaptability, strong communication. Languages: English, Spanish. Jobs: TechCorp, WebSolutions."}]}

Mais detalhes sobre o fine-tuning dos modelos da OpenAI podem ser encontrados na documentação oficial.

Os arquivos para treinamento e validação estão disponíveis no repositório deste artigo! Esses arquivos possuem uma série de diferentes exemplos de input e output. Além disso, os prompts que utilizei para criar esses dados também estão disponíveis.

No portal do Azure AI Foundry, acesse o menu de Fine-tuning e clique em "Criar um novo modelo".

Vamos seguir as etapas:

Utilize "ft" como sufixo do modelo;
Faça upload dos arquivos de treinamento e validação disponíveis;
Use os valores defaults para os parâmetros do processo.

Após isso, aguarde o processo de fine-tuning do modelo ser concluído.

Nosso modelo foi treinado com um custo de 52000 tokens. Considerando que o custo do treinamento do 40-mini é de $3.3 por 1 milhão de tokens, o gasto total foi de aproximadamente $0.1716.

Mais detalhes sobre preços podem ser encontrados aqui!

Podemos verificar os resultados do fine-tuning diretamente pelo portal.

Com o modelo pronto, devemos criar um novo deployment.

Após concluir o processo, podemos voltar ao código do projeto e ajustar o systemMessage, removendo exemplos redundantes (marcados em vermelho) que foram utilizados no prompt inicial. Como o modelo já foi treinado com dados consistentes, esses exemplos não são mais necessários.

Aqui precisamos parar um pouquinho para falar sobre few-shot learning X fine-tuning. O few-shot learning utiliza apenas alguns exemplos para orientar um modelo pré-treinado, permitindo adaptação rápida e econômica. Em contrapartida, o fine-tuning envolve ajustar cuidadosamente os parâmetros do modelo com um volume de dados maior. Dessa forma, o modelo desenvolve maior especialização e desempenho consistente. O few-shot é vantajoso em cenários com dados escassos, enquanto o fine-tuning é ideal para requisitos mais complexos.

O novo systemMessage será:

const string systemMessage = """
    "You are a helpful RH assistant that create input summarizations.
    """;

Lembre-se de atualizar o valor da variável de ambiente com o nome do novo modelo treinado!

Execute novamente o código:

dotnet run

Output: Ethan Reynolds is a 29-year-old Full Stack Developer recognized for building and maintaining scalable web applications. Hard skills: Node.js, React, PostgreSQL, agile development. Soft skills: Strong communication, collaboration, problem-solving. Languages: English. Jobs: ByteLeap, CloudGrid.

Total of tokens: 216

A nova abordagem reduziu o consumo total de tokens em 84%!

As respostas apresentaram algumas diferenças, reflexo dos diferentes processos pelos quais os inputs foram submetidos. No caso do modelo com fine-tuning, os resultados são diretamente relacionados à qualidade dos dados de treinamento. Já na abordagem utilizando exemplos no prompt, os resultados são correlacionados com os exemplos apresentados no momento da execução.

Você já pode baixar o projeto por esse link, e não esquece de me seguir no LinkedIn!

Até a próxima, abraços!

Customizando modelos do Azure Open AI com fine-tuning

Gerenciando imagens de máquinas com Azure Compute Gallery: consistência e automação de VMs

Azure Newsletter - 2025-06-30

Guia Básico de Precificação Azure para Startups

Gerenciando imagens de máquinas com Azure Compute Gallery: consistência e automação de VMs

Azure Newsletter - 2025-06-30

Gerenciando imagens de máquinas com Azure Compute Gallery: consistência e automação de VMs