ChatGPT vs Copilot:
A Blind Preference Study

Which AI do people actually prefer when they don't know which is which?
Xavier University · ENTR 305 · December 2025

Our Research Question

As part of Xavier University's participation in a Microsoft Copilot Proof of Value, we wanted to answer a simple question: when people compare AI-generated responses side by side—without knowing which tool produced which—do they have a preference?

This study was conducted in ENTR 305: Innovation Management, a course that teaches students the value of experimentation in business. The core lesson: simple, well-designed experiments can generate data that drives better decisions.

The Setup: Students received 19 different prompts spanning practical advice, business concepts, professional writing, technical calculations, and personal reflection. For each prompt, they saw two responses labeled simply "Response A" and "Response B"—with the position randomized. They had no idea which AI wrote which response. Using Qualtrics, they rated their preference on a 5-point scale.
42 Xavier students · 19 prompts analyzed · 5 task categories · 797 total ratings*

*One prompt was excluded from analysis due to a data collection error where Copilot returned no response. One participant with incomplete data was also excluded.
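As an illustration of the blinding step described above, here is a minimal Python sketch of how the A/B position shuffle could work. The function and variable names are hypothetical; in the actual study, randomization was handled inside Qualtrics survey logic rather than in code.

```python
import random

def blind_pair(prompt_id, chatgpt_text, copilot_text, rng=random):
    """Randomly assign the two responses to 'Response A' and 'Response B',
    keeping the tool-to-label mapping in a separate answer key."""
    pair = [("ChatGPT", chatgpt_text), ("Copilot", copilot_text)]
    if rng.random() < 0.5:
        pair.reverse()
    shown = {"Response A": pair[0][1], "Response B": pair[1][1]}  # what raters see
    key = {"Response A": pair[0][0], "Response B": pair[1][0]}    # decoded after the survey closes
    return prompt_id, shown, key

# Placeholder usage: real prompts and responses came from the study materials.
pid, shown, key = blind_pair("break-even-analysis",
                             "ChatGPT's response text...",
                             "Copilot's response text...")
print(key)  # e.g. {'Response A': 'Copilot', 'Response B': 'ChatGPT'}
```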

The Results

Overall Finding
Students showed a statistically significant preference for ChatGPT
p = 0.019 · Effect Size: Small (d = 0.38)
How 797 Individual Ratings Broke Down

Preferred Copilot: 34.6% (276 ratings)
No Preference: 22.5% (179 ratings)
Preferred ChatGPT: 42.9% (342 ratings)

Mean score: 3.10 (1-5 scale, where 3.0 = no preference)
ChatGPT received 24% more votes than Copilot
Statistical significance: p = 0.019
Analyzable participants: 42
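The writeup does not name the exact test, but the reported mean of 3.10, p = 0.019, and d = 0.38 are consistent with a one-sample t-test of per-participant mean ratings against the neutral midpoint of 3.0, with Cohen's d as the effect size. A minimal sketch under that assumption follows; the ratings array is simulated, not the study data.

```python
import numpy as np
from scipy import stats

# Simulated per-participant mean ratings on the 1-5 scale (3.0 = no preference).
# The real study had 42 analyzable participants; these values are illustrative only.
rng = np.random.default_rng(0)
participant_means = rng.normal(loc=3.10, scale=0.26, size=42)

# One-sample t-test against the "no preference" midpoint
t_stat, p_value = stats.ttest_1samp(participant_means, popmean=3.0)

# Cohen's d for a one-sample design: (sample mean - midpoint) / sample SD
cohens_d = (participant_means.mean() - 3.0) / participant_means.std(ddof=1)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}, d = {cohens_d:.2f}")
```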

Biggest Winners

🟢 Strongest ChatGPT Preferences
Design Thinking: 60% preferred ChatGPT
Break-Even Analysis: 55% preferred ChatGPT
B2B vs B2C Marketing: 43% preferred ChatGPT

🔵 Where Copilot Held Its Own
T-Shirt Business Advice: 52% preferred Copilot
Student Budget Planning: 48% preferred Copilot
Time Management: 38% preferred Copilot

Results by Task Type

How preferences varied across different kinds of prompts

Practical Guidance: 2.93 → leans Copilot
Seeking Information: 3.26 → leans ChatGPT
Writing Tasks: 3.09 → leans ChatGPT
Technical Help: 3.26 → leans ChatGPT
Self-Expression: 3.02 ≈ even

Scale: 1.0 = strongly prefer Copilot · 3.0 = no preference · 5.0 = strongly prefer ChatGPT
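The per-category means above can be reproduced from a long-format export with a simple group-by. The sketch below uses hypothetical column names and toy data, since the structure of the actual Qualtrics export is not shown here.

```python
import pandas as pd

# Toy long-format data: one row per (participant, prompt) rating on the 1-5 scale,
# where 1 = strongly prefer Copilot and 5 = strongly prefer ChatGPT.
ratings = pd.DataFrame({
    "participant": [1, 1, 2, 2, 3, 3],
    "category": ["Practical Guidance", "Technical Help"] * 3,
    "rating": [3, 4, 2, 3, 3, 4],
})

# Mean rating per task category, mirroring the chart above
category_means = ratings.groupby("category")["rating"].mean().round(2)

# Direction of the lean relative to the 3.0 midpoint
lean = category_means.apply(lambda m: "ChatGPT" if m > 3 else ("Copilot" if m < 3 else "even"))
print(pd.DataFrame({"mean_rating": category_means, "leans_toward": lean}))
```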

Key Takeaways

🎯
The preference is real, but modest
ChatGPT was statistically preferred overall, but 35% of ratings still favored Copilot and 22% saw no difference. This isn't a landslide—it's a small but meaningful edge.
🔀
Task type matters
Copilot actually performed better on practical, actionable advice (like starting a business or managing time). ChatGPT excelled at explanations, writing, and structured technical responses. The "best" AI depends on what you're trying to accomplish.
👁️
Blind testing removes bias
Students had no idea which response came from which AI. This eliminates brand preference, prior experience bias, and interface familiarity—leaving only the quality of the output itself.
📊
Simple experiments yield valuable insights
This study took one class session to deploy via Qualtrics. It generated 797 data points and statistically significant findings. You don't need massive resources to run meaningful experiments.

About This Study

This research was conducted as part of ENTR 305: Innovation Management at Xavier University during Fall 2025. The course teaches students how to use experimentation as a business tool—and this study was a live demonstration of that principle.

The study was designed in the context of Xavier University's participation in a Microsoft Copilot Proof of Value initiative, which sought to understand how faculty, staff, and students perceived Microsoft's AI assistant compared to alternatives.

All responses were generated in November 2025. For Copilot, we used GPT-5 in "Quick Response" mode; for ChatGPT, we used GPT-5 in instant mode. This ensured comparable speed-optimized outputs from both tools. Position randomization was handled through Qualtrics survey logic to ensure unbiased comparisons. One prompt was excluded from the final analysis due to a data collection error where Copilot failed to return a response.

Citation:
ENTR 305 Class (2025). ChatGPT vs Microsoft Copilot Preference Study.
Xavier University, Williams College of Business. December 2025.