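Large language models (LLMs) have become a tool that simplifies our work in many ways, from answering questions to generating task lists. Individuals and businesses already use LLMs to help get their work done.
Code generation and evaluation have recently become key features of many commercial products that help developers work with code. LLMs can also be taken a step further and applied to data science work, especially model selection and experimentation.
This article discusses how to automate model selection and experimentation with an LLM.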
Automating Model Selection and Experimentation with LLMs
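We will set up the dataset for model training and the code for the automation. In this example, we use a fraud dataset from Kaggle, stored locally as fraud_data.csv. Here is the preparation I did for the preprocessing step.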
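    import pandas as pd

    df = pd.read_csv('fraud_data.csv')
    # Drop the columns that are not used as features in this walkthrough
    df = df.drop(['trans_date_trans_time', 'merchant', 'dob', 'trans_num', 'merch_lat', 'merch_long'], axis=1)
    # Discard rows with missing values and renumber the index
    df = df.dropna().reset_index(drop=True)
    # Overwrite the file with the cleaned data
    df.to_csv('fraud_data.csv', index=False)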
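We use only part of the dataset and discard every row with missing data. This is not the optimal process, but our focus here is on model selection and experimentation.
Next, we prepare a folder for the project and put all the related files there. First, we create a requirements.txt file for the environment. You can fill it with the following packages.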
    openai
    pandas
    scikit-learn
    pyyaml
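Next, we use a YAML file (config.yaml, the default path the loader reads later) for all the related metadata. This includes the OpenAI API key, the models to test, the evaluation metrics, and the location of the dataset.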
    llm_api_key: "YOUR-OPENAI-API-KEY"
    default_models:
      - LogisticRegression
      - DecisionTreeClassifier
      - RandomForestClassifier
    metrics: ["accuracy", "precision", "recall", "f1_score"]
    dataset_path: "fraud_data.csv"
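A typo in the configuration file only surfaces once the pipeline is already running, so it can help to check the loaded dictionary up front. The following is a minimal sketch, not part of the original script: `validate_config`, `REQUIRED_KEYS`, and `KNOWN_MODELS` are hypothetical names, and the check assumes the configuration has already been loaded into a plain dictionary.

```python
# Hypothetical helper: validate_config is not part of the original script.
REQUIRED_KEYS = {"llm_api_key", "default_models", "metrics", "dataset_path"}
KNOWN_MODELS = {"LogisticRegression", "DecisionTreeClassifier", "RandomForestClassifier"}


def validate_config(config):
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    # Report every required key that is absent from the config
    for key in sorted(REQUIRED_KEYS - set(config)):
        problems.append(f"missing key: {key}")
    # Report model names that the script has no class mapping for
    for name in config.get("default_models", []):
        if name not in KNOWN_MODELS:
            problems.append(f"unknown model: {name}")
    return problems


# Illustrative config with one missing key and one unsupported model
example = {
    "llm_api_key": "YOUR-OPENAI-API-KEY",
    "default_models": ["LogisticRegression", "XGBClassifier"],
    "metrics": ["accuracy"],
}
problems = validate_config(example)
print(problems)
```

Returning a list rather than raising on the first problem lets you report everything that is wrong in one pass.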
Then we import the packages used in this process. We rely on Scikit-Learn for the modeling and use OpenAI's GPT-4 as the LLM.
    import pandas as pd
    import yaml
    import ast
    import re
    import sklearn
    from openai import OpenAI
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import LabelEncoder
    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
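In addition, we set up helper functions and lookup information to support the process. The functions below cover configuration loading, dataset loading, and data preprocessing.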
    # Map the model names used in the config file to their Scikit-Learn classes
    model_mapping = {
        "LogisticRegression": LogisticRegression,
        "DecisionTreeClassifier": DecisionTreeClassifier,
        "RandomForestClassifier": RandomForestClassifier
    }

    def load_config(config_path='config.yaml'):
        with open(config_path, 'r') as file:
            config = yaml.safe_load(file)
        return config

    def load_data(dataset_path):
        return pd.read_csv(dataset_path)

    def preprocess_data(df):
        # Label-encode every text column and keep the fitted encoders
        label_encoders = {}
        for column in df.select_dtypes(include=['object']).columns:
            le = LabelEncoder()
            df[column] = le.fit_transform(df[column])
            label_encoders[column] = le
        return df, label_encoders
In the same file, we set up the LLM to play the role of a machine learning expert. We use the following code to call it.
    def call_llm(prompt, api_key):
        client = OpenAI(api_key=api_key)
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": "You are an expert in machine learning and able to evaluate the model well."},
                {"role": "user", "content": prompt}
            ]
        )
        return response.choices[0].message.content.strip()
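You can swap the LLM for any model you want, such as an open-source model from HuggingFace, but we recommend sticking with OpenAI for now.
The functions below clean up the LLM result. They ensure the output can be used in the later model selection and experimentation steps.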
    def clean_hyperparameter_suggestion(suggestion):
        # Grab the first {...} span in the LLM reply
        pattern = r'\{.*?\}'
        match = re.search(pattern, suggestion, re.DOTALL)
        if match:
            cleaned_suggestion = match.group(0)
            return cleaned_suggestion
        else:
            print("Could not find a dictionary in the hyperparameter suggestion.")
            return None

    def extract_model_name(llm_response, available_models):
        # Return the first configured model name that appears in the reply
        for model in available_models:
            pattern = r'\b' + re.escape(model) + r'\b'
            if re.search(pattern, llm_response, re.IGNORECASE):
                return model
        return None

    def validate_hyperparameters(model_class, hyperparameters):
        # Reject parameter names the model does not accept
        valid_params = model_class().get_params()
        invalid_params = []
        for param, value in hyperparameters.items():
            if param not in valid_params:
                invalid_params.append(param)
            else:
                if param == 'max_features' and value == 'auto':
                    print(f"Invalid value for parameter '{param}': '{value}'")
                    invalid_params.append(param)
        if invalid_params:
            print(f"Invalid hyperparameters for {model_class.__name__}: {invalid_params}")
            return False
        return True

    def correct_hyperparameters(hyperparameters, model_name):
        # Replace the 'auto' value for max_features with 'sqrt'
        corrected = False
        if model_name == "RandomForestClassifier":
            if 'max_features' in hyperparameters and hyperparameters['max_features'] == 'auto':
                print("Correcting 'max_features' from 'auto' to 'sqrt' for RandomForestClassifier.")
                hyperparameters['max_features'] = 'sqrt'
                corrected = True
        return hyperparameters, corrected
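One limitation of the non-greedy pattern `\{.*?\}` is worth knowing: it stops at the first closing brace, so a suggestion containing a nested dictionary (for example a `class_weight` mapping) is cut short and then fails to parse. Below is a minimal sketch of a brace-counting alternative. `extract_dict_literal` is a hypothetical helper, not part of the original script, and it assumes braces do not appear inside string values.

```python
import ast
import re


def extract_dict_literal(text):
    """Return the first brace-balanced {...} span that parses as a dict, or None."""
    start = text.find("{")
    while start != -1:
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    candidate = text[start:i + 1]
                    try:
                        value = ast.literal_eval(candidate)
                    except (ValueError, SyntaxError):
                        break  # not a valid literal; try the next '{'
                    if isinstance(value, dict):
                        return value
                    break  # a literal, but not a dict (for example a set)
        start = text.find("{", start + 1)
    return None


# Illustrative LLM reply containing a nested dictionary
reply = "Suggested: {'n_estimators': 200, 'class_weight': {0: 1, 1: 10}}"

# The non-greedy regex stops at the first '}' and loses the outer brace
naive = re.search(r'\{.*?\}', reply, re.DOTALL).group(0)
print(naive)

# The brace-counting version recovers the whole dictionary
parsed = extract_dict_literal(reply)
print(parsed)
```

For flat dictionaries, which is what the prompt in this article asks for, both approaches give the same result; the difference only matters if you later allow nested values.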
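Then we need a function that handles model training and evaluation. The code below trains a model given the split dataset, the model name to look up in the mapping, and the hyperparameters. It returns the metrics and the model object.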
    def train_and_evaluate(X_train, X_test, y_train, y_test, model_name, hyperparameters=None):
        if model_name not in model_mapping:
            print(f"Valid model names are: {list(model_mapping.keys())}")
            return None, None

        model_class = model_mapping.get(model_name)
        try:
            if hyperparameters:
                # Fix known bad values first, then validate what remains
                hyperparameters, corrected = correct_hyperparameters(hyperparameters, model_name)
                if not validate_hyperparameters(model_class, hyperparameters):
                    return None, None
                model = model_class(**hyperparameters)
            else:
                model = model_class()
        except Exception as e:
            print(f"Error instantiating model with hyperparameters: {e}")
            return None, None

        try:
            model.fit(X_train, y_train)
        except Exception as e:
            print(f"Error during model fitting: {e}")
            return None, None

        y_pred = model.predict(X_test)
        # Support-weighted averages across the classes
        metrics = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred, average='weighted', zero_division=0),
            "recall": recall_score(y_test, y_pred, average='weighted', zero_division=0),
            "f1_score": f1_score(y_test, y_pred, average='weighted', zero_division=0)
        }
        return metrics, model
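A note on the `average='weighted'` setting: recall weighted by class support is mathematically equal to accuracy, which is why the two values match in the sample output of this article. On an imbalanced problem such as fraud detection, that means the weighted figure says little about how many fraud cases are actually caught. The pure-Python sketch below illustrates the identity on made-up labels; the helper names are hypothetical and the data is a toy example, not taken from the dataset.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_for(label, y_true, y_pred):
    """Recall of one class: correctly predicted members / all true members."""
    members = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in members) / len(members)


def weighted_recall(y_true, y_pred):
    """Per-class recall weighted by each class's share of the true labels."""
    total = len(y_true)
    return sum(
        recall_for(label, y_true, y_pred) * y_true.count(label) / total
        for label in set(y_true)
    )


# Toy labels: 8 legitimate (0) and 2 fraudulent (1) transactions
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # one fraud case is missed

# Weighted recall and accuracy agree (up to floating-point rounding)
print(math.isclose(weighted_recall(y_true, y_pred), accuracy(y_true, y_pred)))

# Recall on the fraud class alone is much lower: half the fraud cases are missed
fraud_recall = recall_for(1, y_true, y_pred)
print(fraud_recall)
```

If catching fraud is the goal, one option is to also report the metrics for the positive class on its own, so that the LLM sees them when it compares models.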
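With everything in place, we can set up the automation. There are several steps we can automate, including:
1. Training and evaluating all the models
2. Having the LLM select the best model
3. Asking about hyperparameter tuning for the best model
4. Automatically running the hyperparameter tuning if the LLM suggests it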
    def run_llm_based_model_selection_experiment(df, config):
        # Model training
        X = df.drop("is_fraud", axis=1)
        y = df["is_fraud"]
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

        available_models = config['default_models']
        model_performance = {}

        for model_name in available_models:
            print(f"Training model: {model_name}")
            metrics, _ = train_and_evaluate(X_train, X_test, y_train, y_test, model_name)
            if metrics is None:
                # Keep failed runs out of the comparison sent to the LLM
                print(f"Skipping {model_name}: training failed.")
                continue
            model_performance[model_name] = metrics
            print(f"Model: {model_name} | Metrics: {metrics}")

        # LLM selecting the best model
        sklearn_version = sklearn.__version__
        prompt = (
            f"I have trained the following models with these metrics: {model_performance}. "
            "Which model should I select based on the best performance?"
        )
        best_model_response = call_llm(prompt, config['llm_api_key'])
        print(f"LLM response for best model selection:\n{best_model_response}")

        best_model = extract_model_name(best_model_response, available_models)
        if not best_model:
            print("Error: Could not extract a valid model name from LLM response.")
            return
        print(f"LLM selected the best model: {best_model}")

        # Check for hyperparameter tuning
        prompt_tuning = (
            f"The selected model is {best_model}. Can you suggest hyperparameters for better performance? "
            "Please provide them in Python dictionary format, like {'max_depth': 5, 'min_samples_split': 4}. "
            f"Ensure that all suggested hyperparameters are valid for scikit-learn version {sklearn_version}, "
            "and avoid using deprecated or invalid values such as 'max_features': 'auto'. "
            "Don't provide any explanation or return in any other format."
        )
        tuning_suggestion = call_llm(prompt_tuning, config['llm_api_key'])
        print(f"Hyperparameter tuning suggestion received:\n{tuning_suggestion}")

        # Parse the reply into a dictionary without executing it
        cleaned_suggestion = clean_hyperparameter_suggestion(tuning_suggestion)
        if cleaned_suggestion is None:
            suggested_params = None
        else:
            try:
                suggested_params = ast.literal_eval(cleaned_suggestion)
                if not isinstance(suggested_params, dict):
                    print("Hyperparameter suggestion is not a valid dictionary.")
                    suggested_params = None
            except (ValueError, SyntaxError) as e:
                print(f"Error parsing hyperparameter suggestion: {e}")
                suggested_params = None

        # Automatically run hyperparameter tuning if suggested
        if suggested_params:
            print(f"Running {best_model} with suggested hyperparameters: {suggested_params}")
            tuned_metrics, _ = train_and_evaluate(
                X_train, X_test, y_train, y_test, best_model, hyperparameters=suggested_params
            )
            print(f"Metrics after tuning: {tuned_metrics}")
        else:
            print("No valid hyperparameters were provided for tuning.")
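In the code above, I specify how the LLM evaluates each of our models based on the experiment results. We use the following prompt to select a model according to its performance.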
    prompt = (
        f"I have trained the following models with these metrics: {model_performance}. "
        "Which model should I select based on the best performance?"
    )
You can always change the prompt to apply different rules for model selection.
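As one illustration of a different selection rule, the sketch below builds a prompt that names a primary metric, states a tie-breaker, and asks for the model name only. `build_selection_prompt` is a hypothetical helper and the metric values are made up for the example. Asking for the name alone is an assumption about what makes the reply easier to parse: `extract_model_name` returns the first configured model it finds in the reply, so a reply that discusses several models could match the wrong one.

```python
def build_selection_prompt(model_performance, primary_metric="recall"):
    """Build a selection prompt that ranks models by one named metric."""
    # One line per model, with metrics rounded for readability
    lines = [
        f"- {name}: " + ", ".join(f"{k}={v:.4f}" for k, v in metrics.items())
        for name, metrics in model_performance.items()
    ]
    return (
        "I have trained the following models:\n"
        + "\n".join(lines)
        + f"\nSelect the model with the highest {primary_metric}; "
        "break ties using f1_score. "
        "Reply with the model name only."
    )


# Made-up metrics, for illustration only
performance = {
    "LogisticRegression": {"recall": 0.91, "f1_score": 0.90},
    "RandomForestClassifier": {"recall": 0.97, "f1_score": 0.97},
}
selection_prompt = build_selection_prompt(performance)
print(selection_prompt)
```

The returned string can be passed to `call_llm` in place of the original prompt.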
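Once the best model is selected, I use the following prompt to ask which hyperparameters should be used for the next step. I also specify the Scikit-Learn version, because the valid hyperparameters change between versions.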
    prompt_tuning = (
        f"The selected model is {best_model}. Can you suggest hyperparameters for better performance? "
        "Please provide them in Python dictionary format, like {'max_depth': 5, 'min_samples_split': 4}. "
        f"Ensure that all suggested hyperparameters are valid for scikit-learn version {sklearn_version}, "
        "and avoid using deprecated or invalid values such as 'max_features': 'auto'. "
        "Don't provide any explanation or return in any other format."
    )
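You can change the prompt in any way you like, for example by asking for more adventurous hyperparameter tuning or by adding another technique.
I put all of the code above in a file named automated_model_llm.py. Finally, add the following code to run the whole process.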
    def main():
        config = load_config()
        df = load_data(config['dataset_path'])
        df, _ = preprocess_data(df)
        run_llm_based_model_selection_experiment(df, config)

    if __name__ == "__main__":
        main()
Once everything is ready, run the following command to execute the script.
    python automated_model_llm.py
Output:
    LLM selected the best model: RandomForestClassifier
    Hyperparameter tuning suggestion received:
    {'n_estimators': 100,
    'max_depth': None,
    'min_samples_split': 2,
    'min_samples_leaf': 1,
    'max_features': 'sqrt',
    'bootstrap': True}
    Running RandomForestClassifier with suggested hyperparameters: {'n_estimators': 100, 'max_depth': None, 'min_samples_split': 2, 'min_samples_leaf': 1, 'max_features': 'sqrt', 'bootstrap': True}
    Metrics after tuning: {'accuracy': 0.9730041532071989, 'precision': 0.9722907483489197, 'recall': 0.9730041532071989, 'f1_score': 0.9724045530119824}
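This is the sample output from my experiment, and yours may differ. You can adjust the prompt and the generation parameters to get more varied or more strict LLM output. Either way, as long as you structure the code properly, an LLM can be applied to automating model selection and experimentation.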
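As a sketch of what adjusting the generation parameters could look like, the helper below assembles the keyword arguments for the chat completion call with an explicit `temperature`. Lower values make the reply more repeatable, and higher values make it more varied. `build_chat_request` is a hypothetical helper, not part of the original script, and the API call itself is left as a comment because it needs the openai package and a valid key.

```python
def build_chat_request(prompt, temperature=0.0, model="gpt-4"):
    """Assemble keyword arguments for client.chat.completions.create."""
    return {
        "model": model,
        "temperature": temperature,  # lower is more deterministic
        "messages": [
            {"role": "system", "content": "You are an expert in machine learning and able to evaluate the model well."},
            {"role": "user", "content": prompt},
        ],
    }


# A strict request for repeatable model selection
strict = build_chat_request("Which model should I select?", temperature=0.0)
# A looser request for more exploratory hyperparameter suggestions
varied = build_chat_request("Can you suggest hyperparameters?", temperature=1.0)
print(strict["temperature"], varied["temperature"])

# With the openai package installed and an API key configured:
# client = OpenAI(api_key=config['llm_api_key'])
# response = client.chat.completions.create(**strict)
```

A low temperature suits the selection step, where the reply has to be parsed, while a higher one may be worth trying for the tuning step if you want to explore different suggestions across runs.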
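Conclusion
LLMs have been applied to many use cases, including code generation. By using an LLM such as an OpenAI GPT model, we can easily delegate model selection and experimentation to it, as long as we structure its output correctly. In this example, we used a sample dataset to experiment with several models and let the LLM select a model and run an experiment to improve it.
Original title: Model Selection and Experimentation Automation with LLMs. Author: Cornellius Yudha Wijaya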