cerebras.modelzoo.data_preparation.data_preprocessing.hooks.finetuning_llava_hook#

cerebras.modelzoo.data_preparation.data_preprocessing.hooks.finetuning_llava_hook(example, **read_hook_kwargs)[source]#

Transforms conversation data for finetuning LLaVA.

Parameters
  • example (Dict[str, Any]) – The input data containing conversation and image paths.

  • **read_hook_kwargs (Any) – Additional keyword arguments containing data_keys, system_prompt, image_token, multi_turn_content_key, and phase.

Returns

Transformed data suitable for finetuning LLaVA.

Return type

List[Dict[str, Any]]

Raises
  • AssertionError – If required keys are not provided in read_hook_kwargs.

  • ValueError – If image_token is not provided, or if there are multiple image tokens in the user’s role, or if image tokens are found in the assistant’s response.