Instruction-Based Social Media Caption Evaluation and Enhancement Dataset

Published: 27 April 2026 | Version 1 | DOI: 10.17632/fz28c2ghv4.1
Contributors:

Description

This dataset contains 1,698 records of Arabic social media advertising captions and descriptions of products sold online, each evaluated in detail by experts. It is designed to support natural language processing (NLP) research, especially Arabic text generation, copywriting evaluation, and sentiment analysis in marketing. The 1,698 captions were collected manually, by copying and pasting, from real, active ads on Facebook. The goal is to evaluate the quality of written advertisements and to provide improved versions that attract customers and generate genuine engagement.

How was the evaluation done? (Our standards)

Each caption was analyzed comprehensively and scored on a 100-point scale, distributed as follows:
- Planning (P): 15 points
- Interaction (E): 20 points
- Quality (Q): 20 points
- Reach and CTA (R): 20 points
- Influence (I): 25 points

Output Structure:

Each row in the data contains the following:
- Score: X/100 (based on the distribution above).
- Why?: Two sentences explaining the reason for the score and the problems in the original caption.
- Two quick edits: Practical tips for fixing the ad.
- Improved version: The refreshed caption, ready to be published.

How did we work? (Methodology & Tools):

To achieve this level of accuracy, we relied on more than one evaluation tool, with generative AI models as the primary one. Because AI sometimes hallucinates, all evaluations and improved versions were produced under full human supervision and carefully reviewed by us, to ensure that the text is logical and that no hallucinations appear in the results.

Potential Use Cases:

- Fine-tuning Large Language Models (LLMs) for Arabic marketing text generation.
- Training models for automated copywriting assessment and scoring.
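The point distribution above can be sketched in Python as a quick sanity check when loading the data. This is a minimal illustration, not the authors' tooling: the names `RUBRIC_MAX`, `total_score`, and `parse_score` are hypothetical, and the sketch assumes that the score field is stored as text of the form "Score: X/100". Whether the files also expose per-criterion sub-scores is not stated in the description, so `total_score` only shows how such a breakdown would be validated if it were available.

```python
import re

# Maximum points per criterion, as listed in the description.
# The single-letter keys follow the labels given there (P, E, Q, R, I).
RUBRIC_MAX = {"P": 15, "E": 20, "Q": 20, "R": 20, "I": 25}

# Assumed text format of the score field: "Score: X/100".
SCORE_PATTERN = re.compile(r"Score:\s*(\d{1,3})\s*/\s*100")


def total_score(sub_scores):
    """Sum per-criterion points after checking each one against its maximum."""
    if set(sub_scores) != set(RUBRIC_MAX):
        raise ValueError("expected exactly the criteria P, E, Q, R, I")
    for key, value in sub_scores.items():
        if not 0 <= value <= RUBRIC_MAX[key]:
            raise ValueError(f"{key} must be between 0 and {RUBRIC_MAX[key]}")
    return sum(sub_scores.values())


def parse_score(text):
    """Extract X from 'Score: X/100'; return None if absent or out of range."""
    match = SCORE_PATTERN.search(text)
    if match is None:
        return None
    score = int(match.group(1))
    return score if 0 <= score <= 100 else None
```

A check like `sum(RUBRIC_MAX.values()) == 100` confirms that the five criteria add up to the full scale, and `parse_score` can flag rows whose score text is missing or malformed before they are used for fine-tuning or scoring models.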

Files

Institutions

Categories

e-Commerce Marketing, Social Media Marketing, Digital Marketing

Licence