AI in the Workplace
Key Take Home Messages
1. Decide if AI is right for your workflow
- Any technology has inherent risk. It is important that you establish your risk appetite and management strategy.
2. Treat AI as a collaborative assistant, not a replacement
- Frame AI tools (e.g. LLMs, task-automation bots) as “colleagues” that augment human skills.
- Assign them clear, well-defined tasks (data summarisation, first drafts, routine analysis) while preserving thorough human oversight.
3. Establish clear governance and ethical guidelines
- Define which tasks are appropriate for AI, and which always require human judgement (e.g. business decisions, critical reviews).
- Create an “AI ethics checklist” covering transparency, accountability, fairness and compliance with industry regulations.
4. Invest in prompt-crafting and user training
- Teach staff how to write clear, specific prompts (provide context, desired format, examples).
- Share best-practice prompt templates and run short workshops so everyone understands both the capabilities and limits of your chosen AI tools.
5. Validate and verify all AI outputs
- Implement a simple review step for any AI-generated content, especially where accuracy matters (legal, financial, technical).
- Flag potential “hallucinations” or bias, and cross-check facts against trusted sources before publication or decision-making.
6. Maintain data privacy and security
- Assume any data shared with a commercial AI offering could be made public.
- Apply data access and sharing controls, and undertake regular audits to prevent unauthorised exposure of proprietary or personal information.
- Where practical, consider self-hosting models or using on-premises deployments so that sensitive data never leaves your environment.
7. Develop action plans for adverse events
- Things can and do go wrong. AI needs to be included as part of your cyber security and incident management strategies.
- Identify potential risks and ensure you have the appropriate plans in place.
Introduction
I live in a bubble (in many ways, I suspect). I’ve had artificial intelligence (AI) subscriptions for about three years and had just kind of assumed most people were already leveraging AI to some degree, even if only to make humorous images for their friends and family. However, in a couple of recent conversations, I realised this might not be the case. With the discussion around AI hardening into two prevailing perspectives—AI can already replace everyone’s job, or AI is a waste of time—I thought a real-world use case might offer a more grounded view.
AI is a whole domain of computer science with a long history and many potential applications. However, what people increasingly think of as AI are large language models (LLMs). Keeping a narrow focus, I wanted to detail a few of my thoughts on working with LLMs, such as those developed by OpenAI, DeepSeek, Meta and others. This isn't a deep dive into how LLMs work (which is beyond me), but rather my experiences. It is hopefully written in a way that makes it broadly useful.
A large language model (LLM) is a type of artificial intelligence (AI) that acts like a text predictor: you give it some words, and it guesses what comes next. They work, for want of a better term, by voting: each part of the LLM votes on a given output.
Inside each LLM are thousands, or even billions, of tiny decision makers called nodes or neurons. These nodes are layered upon each other, and each one has a few simple settings that can be adjusted (weights and a bias). When you ask a question, the question is split into small pieces of text called tokens (like words or parts of words), and these are then turned into numbers. Taking that series of numbers, the decision makers vote, layer by layer, on what the next word in the sequence should be. Those votes combine until there is a consensus, and the model outputs your text.
The model learns by practice: during training it analyses a massive volume of text from a variety of sources (books, articles, websites, code, etc.) and repeatedly tries to predict the next word, adjusting its settings to reduce errors over time.
Larger models—with more of the tiny decision makers—can spot subtler patterns, but they also need more computing power. Companies often offer several models that vary in size or topic—some are general‑purpose, others are “fine‑tuned” on legal text, medical research, code, or other specialised material.
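For the technically inclined, the “voting” idea can be made concrete with a toy sketch. The Python below is nothing like a real LLM in scale or cleverness, and every number and word in its tiny vocabulary is made up for illustration; it only shows the mechanics of text becoming numbers, the numbers passing through layers of adjustable weights, and the final layer producing a vote (a probability) for each possible next token:

import numpy as np

# A made-up vocabulary of "tokens" (real models use tens of thousands).
vocab = ["the", "cat", "sat", "on", "mat"]

def embed(token):
    # Turn a token into numbers (a one-hot vector here; real models learn richer embeddings).
    v = np.zeros(len(vocab))
    v[vocab.index(token)] = 1.0
    return v

# Each "layer" is just adjustable weights and biases, set randomly for this toy example.
rng = np.random.default_rng(0)
layer1_w, layer1_b = rng.normal(size=(len(vocab), 8)), np.zeros(8)
layer2_w, layer2_b = rng.normal(size=(8, len(vocab))), np.zeros(len(vocab))

def predict_next(tokens):
    # Combine the input tokens into one set of numbers.
    x = sum(embed(t) for t in tokens)
    # Layer by layer, the "decision makers" transform the numbers...
    hidden = np.maximum(0, x @ layer1_w + layer1_b)
    scores = hidden @ layer2_w + layer2_b
    # ...and the final scores become a "vote" (probability) for every token in the vocabulary.
    probs = np.exp(scores) / np.exp(scores).sum()
    return dict(zip(vocab, probs.round(3)))

print(predict_next(["the", "cat", "sat"]))

In a real model, training nudges millions or billions of these weights so that the strongest vote usually lands on a sensible next word; here the weights are random, so the votes are meaningless.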
A friend once told me that when hiring, it’s more important to choose people you enjoy working with than to focus too narrowly on their existing skills. Of course, this doesn’t apply to every role—you wouldn’t hire a brain surgeon based on personality alone—and most jobs do require a baseline of expertise. Still, the broader point holds: strong working relationships are built on good communication and effective management.
That insight shapes my advice on AI: find a model you like working with and treat it like an employee you're managing. Like any new hire, it needs guidance, clear instructions, and feedback to be effective. But there’s an important caveat—it’s not human.
LLMs, even the newest models that claim to “reason”, do not work like human brains. They’re mathematical models sampling along a curve of data, selecting outputs based on your input. Although this can result in some amusingly human-like replies, those replies are ultimately reflections of probability, not understanding. This means they can generate impressively creative output but also confidently produce incorrect or misleading information (called “hallucinations”). That makes them risky for tasks requiring strict factual accuracy, but valuable when you need speed, creativity, or novel perspectives.
What follows is a jumping-off point for using LLMs. It is by no means exhaustive, but will hopefully provide enough guidance to at least decide if LLMs may be useful for you.
Before Beginning
Is AI right for this task?
Just because AI is accessible doesn’t mean it’s always appropriate. Despite bold claims by some CEOs about replacing entire workforces with AI, we're a long way from AI replacing true expertise. Perhaps the most interesting example is the rapid rise of AI-generated computer code. Despite how lucrative it will likely be for them, cybersecurity experts are increasingly expressing concern.
Large language models can produce output that looks convincing—even professional—sometimes complete with fabricated references. However, if you lack the subject-matter knowledge to verify that output, or don’t have the time to carefully review it, you shouldn’t be using AI for that task. Also you certainly shouldn’t be talking about replacing employees with it.
There have already been high-profile cases of misuse. In Australia, lawyers have been caught submitting court documents based on fabricated legal citations produced by LLMs, and we have recently seen a scandal involving a government-commissioned report. These aren’t harmless mistakes; they damage reputations, delay proceedings and, in regulated professions, carry serious consequences.
Even experienced professionals are being caught out. In the policy world, I’ve noticed a rapid rise in low-quality submissions from groups that once produced strong, considered work. My suspicion? Over-reliance on AI for volume over value.
AI is a powerful tool, but it’s not a shortcut to quality. If you're not willing (or able) to review and curate the output, you’re not ready to use it for serious work. AI should not be adopted just to generate slop.
What is a prompt?
Knowing which terms to put into an internet search engine is a critical skill, but it is nothing compared to the prompt formulation needed to maximise the potential of AI. Within the scope of large language models (LLMs), a prompt is the text or question you provide to tell the model what you want. In prompting, clear communication is key. Prompts often resemble the instructions you might give a staff member on a particular topic. Likewise, the clearer and more specific your prompt, the better the model can understand your intent and give a useful answer.
There are a lot of guides out there; this Harvard one is still relevant, or this Google one goes into more detail. When writing a prompt, detail and context are highly important, but you can start with simple requests and refine your prompt, and the output, as you go.
What output do I need?
Identifying the required output involves considering straightforward factors—such as topic, length, writing style and format—but you should also think critically about the potential for errors and how those errors might impact your goals. What validation will be required?
You should also think about how you’ll shape the output to align with your goals. Outline a workflow that clearly defines your goals, includes potential prompts, and anticipates points where refinement will be needed. It also helps to list known risks or common mistakes, particularly if you're working with technical or policy content.
Remember, LLM outputs are not destinations but starting points. You should expect to refine, review, and reshape any output through a process of review and iteration. Keeping a version-controlled document during this process not only helps track changes but also provides insight into how your prompting strategy evolves and where things tend to go wrong.
Will people know if I use AI?
Possibly, but that shouldn’t matter if you are using it well. AI should support your expertise, not substitute for it. If you’re relying on it to cover for gaps in your skills or knowledge, then it’s not the right tool or you’re not in the right role.
There are certain hallmarks that suggest AI involvement. One is the overuse of em dashes (—), which pains me personally as I have a strong preference for their use. Another clue is formulaic phrasing, like “in conclusion,” or tell-tale errors such as “As a large language model, I am…” making it into the final version. Clumsy or repetitive sentence structures can also raise flags, though high-quality prompts and proper editing often smooth these out.
Several companies offer software that claims to detect AI-generated content, mainly targeted at educators. In my opinion, these tools are unreliable. Unless the output is egregiously unedited, they tend to guess poorly and offer a false sense of security. The recent actions by Australian Catholic University highlight the dangers of this false security.
Who do I trust?
The old adage that if you’re not paying, you’re the product being sold may no longer go far enough. Increasingly, even if you are paying, your data is still being collected, analysed and repurposed.
Depending on your own circumstances one company might be better than another, but no AI provider has a spotless record, and it is important to remain vigilant about what you share. Chinese company DeepSeek has drawn criticism for its data practices (sending user data to Chinese servers) and its accuracy around specific historical events and policies. Interestingly, DeepSeek broke Nvidia’s near-monopoly by using alternative computer chips for analysis, briefly impacting Nvidia’s stock price. Oddly, other companies seem to have evaded company- or government-wide bans. For example, Meta (Facebook) has been making public the prompts users have been entering (check your privacy settings); it is best not to even explore the recent activity of Grok (xAI/Twitter); and OpenAI no longer views mass manipulation of the population as a risk.
Personally, I use models from OpenAI and have been a paid user for quite some time. I have also had short Claude subscriptions. Even so, I use these tools mindfully, given that much of what I do is sensitive and I’m generally kind of paranoid.
Which model should I use?
AI companies typically offer a range of models varying in size, speed, training data and cost. You’ll find numerous articles comparing models on specific tasks, but unless your required output aligns closely with one of those tests, such comparisons may not be useful. Model selection can matter, but don’t overthink it. In many cases, it simply comes down to what you have access to.
When looking at models, it is worth remembering that most commercial models are black boxes: you don’t know how they’re trained, what constraints are applied, or exactly how your data is handled. Moreover, even models from the same provider can produce very different responses. A little trial and error is often necessary to find the one that best suits both your style and your task.
I personally liked OpenAI's o4-mini model as a general-purpose option. However, while writing this article, OpenAI deprecated all of its older models and replaced them with GPT-5. I am now exploring other models, including Claude and Gemini. This unannounced retiring of older models highlights a key issue: a model you are using may be replaced or updated, resulting in inconsistent output. This is why it is very important that any output is carefully curated, regardless of previous experience.
What data should I share?
Although most LLM providers claim they don't use your data for training or external sharing, history tells us to be cautious. Supplying a model with context and examples, such as previous emails or existing policy positions, will typically greatly improve output. However, you need to ask what would happen if this data were made public, and who owns the data to begin with. Data ownership and confidentiality matter. This is particularly important in policy, legal, or client-based work.
In addition, governments and courts are moving towards requiring that LLM providers store all conversations, even ones you delete. Without unpacking this too much, this can be a significant safety risk depending on where you live or your individual circumstances.
Data breaches should now be thought of as a matter of when, not if. Reflect on what it would mean for you if someone gained access to your account, or broke into your provider’s servers, and saw everything you have shared, especially when coupled with regulations preventing the deletion of data. Finally, it is important to remember that your conversations with an LLM are not privileged; the application data can be subpoenaed.
Risk tolerance will vary, but a simple principle applies: if privacy matters, assume everything you share could one day be seen. In highly risk-averse cases, consider using self-hosted or locally run models, or whether AI is even appropriate for a given task.
Self-hosting an LLM means running the model on your own servers or computers instead of relying on a third-party or cloud service. This provides full control over the model’s configuration, updates and data storage, ensuring that any prompts or outputs never leave your infrastructure. Self-hosting requires that you provide sufficient/suitable hardware, manage the LLM software, apply security patches and handle backups yourself. In return, you gain greater data privacy, longer-term cost predictability and the freedom to adapt the model as your projects evolve.
While this does provide a significant boost to security, it remains impractical for most workplaces due to the cost and technical expertise required. Moreover, you may have to rely on older or smaller LLMs.
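As a rough illustration, a self-hosted setup can be as simple as a model server running on your own machine that you query over a local connection. The Python sketch below assumes you have installed Ollama (one popular open-source tool for running models locally) and downloaded a model such as llama3; the address, endpoint and field names are specific to that tool and will differ for other self-hosting options:

import json
import urllib.request

def ask_local_model(prompt, model="llama3"):
    # A minimal sketch of querying a self-hosted model.
    # Assumes Ollama is running locally and the named model has already been
    # downloaded; other self-hosting tools expose different endpoints and fields.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        "http://localhost:11434/api/generate",  # the model runs on your own machine
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

print(ask_local_model("Summarise the key risks of sharing client data with cloud AI services."))

The key point is that the prompt and the response never leave your own infrastructure; the trade-off is that you are responsible for the hardware, updates and security of that server.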
Potential Uses
While AI tools can be applied in countless ways—from generating recipes to troubleshooting obscure electronics—this overview focuses on how LLMs can support professional policy work. This is a non‑exhaustive guide. Your own experiences may vary depending on the task, the tool, and the care taken in reviewing outputs.
Consolidating, rephrasing & proofreading
This is most likely my primary use of LLMs. It is not a substitute for rereading a document yourself. However, I make a LOT of minor errors and typos in my writing, and when proofing I tend to read over them unless there is significant time between writing and reading. Using an LLM has greatly helped with this but, at the risk of sounding like a broken record, be very mindful about what you share.
You can ask an LLM to review a whole document, or just a single sentence. Moreover, your prompting can greatly shape the type of response you get back. Your prompting may look like this:
I have included a paragraph below from an email to a senior government official. I am writing as the Government Relations Officer from a small NGO dedicated to resident concerns in the Adelaide Hills Council. We are concerned about the lack of commitment to the State Government principles described in their policy position <link>. The paragraph is just a polite ask reinforcing a previously raised concern. However, I worry the tone is incorrect. The official is very senior. I want to make sure the writing is professional and in line with Australian best practices. I would like to be friendly but make it clear that the matter has not been resolved. Could you please help? <insert paragraph>
The following sentence sounds clumsy to me. Can you please reword it for me so that it flows better? <insert sentence>
Please read the following letter. I want to write it for an older audience (60+ years old). What would you change? Would you mind explaining why you would change it? <insert letter>
It is possible to be highly specific in prompting to aid in obtaining your desired outcome. You can also completely reshape writing. One of my favourite aspects is asking why an AI might suggest a specific change. This allows me to reflect on my writing and perhaps ensure that the next piece is on target.
Document summary
LLMs should never replace your expertise. However, in policy work, timing matters. Large policy documents, legislative changes or court decisions can drop suddenly. A quick summary can greatly aid your positioning, allow you to be first out of the gate, identify parts for deeper review, or rapidly surface novel issues that you might not normally consider. It is not uncommon to scan documents for key words of interest; AI summaries can be used the same way.
Although a rapid summary is helpful, great care is needed to ensure any conclusions are actually supported by the document. To start exploring potential issues, it is worth asking the LLM questions such as:
You suggested the document highlights failures in the government review process. What parts of the document do so? Can you provide me with the page number and a few sentences for each point that highlights this?
Your point #5 relating to this document, regarding the duty of care, seems to contradict the work by Jones et al (link). Does the document in question represent a paradigm shift?
These follow‑ups serve both to focus your review and verify AI‑generated claims.
Ideas
Writer’s block can halt progress. While reflection can be valuable, breaking the procrastination cycle often requires a spark. LLMs can act as a sounding board to explore topics, angles, or creative approaches you might not have considered. Your LLM prompts might resemble:
Provide me with social media ideas to support the advocacy for comfy couches.
Is there a link between afternoon naps and productivity? Could you provide any evidence that supports a relationship (either positive or negative)?
In each case you might follow up with additional prompts to refine the responses or further explore particular options.
Draft creation
Creating a first draft can be a daunting task but sometimes all you need is a few words on a page to get you started. LLMs excel at breaking through that initial barrier. By turning a prompt into a structured starting point, you have something tangible to work with. This will not be final, but you also won’t be staring at a blank page. This makes LLMs powerful for overcoming inertia, generating alternative phrasings, or exploring multiple directions quickly.
The key is to see AI as a springboard, not a substitute—a partner that accelerates your thinking and helps you get to the real work faster.
The Process
Start simple
Simple prompts, especially early on, can provide you with a great starting point. If you’re unsure exactly what type of output you’re after, start with something straightforward. For example:
Write me 1000 words on the importance of nutrient fortification in food. Include citations.
I am writing a lengthy report on mushroom cultivation. Write me a draft, including all headings that I should cover on this topic.
Starting simple helps you see how the model interprets your request before you invest time in refining details.
Use repetition to explore alternatives
Asking the LLM to regenerate a response, or running the same prompt multiple times, can produce different angles, structures, or emphases. While sometimes repetitive, it can surface a better‑framed explanation or highlight options you hadn’t considered.
Look for a “Regenerate” or “Try Again” button in your chosen platform.
Refine through feedback
LLMs improve when you give clear, targeted feedback. The initial offering is often lacking, but with specific feedback you can refine the output. This might include:
This is too academic. Please make it more accessible to a non‑technical audience.
Add examples from Australian policy to support the argument.
Think of the AI as a junior colleague—the clearer your feedback, the better the next draft.
Be highly specific from the start
Detailed prompts save time by reducing the need for back‑and‑forth refinement. It is possible to include:
Audience (e.g. politicians, industry leaders, the general public)
Purpose (persuasion, analysis, briefing)
Sources to use (links, citations)
Tone and style (formal, persuasive, concise)
Formatting needs (headings, bullet points, citation style)
A detailed prompt might look like:
You are writing a report on recycling soft plastics for politicians and their staffers. The report needs a very high level executive summary that uses dot points so that a busy person can get the key take-home messages quickly. The report should draw on this article <insert link>. It should also draw on the language used by the Prime Minister as quoted here <insert link>. The report needs to include the process of soft plastic recycling but it is primarily on the need for investment to support this industry. The benefits have been highlighted here <insert link>. Please include other citations where appropriate. Include a bibliography at the end in the style of the Melbourne Law School Guide to Legal Citation. Also include footnotes in the same style. The report can draw on international information, and will need to for the proven benefits, but make sure the language is Australian centric. Make sure all spellings conform with Australian-English.
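If you find yourself reusing the same structure, it can help to keep a simple template that assembles the checklist above (audience, purpose, sources, tone, formatting) into a single prompt. The short Python sketch below is just one way to do this, and every field value in the example is hypothetical; you could achieve the same thing with a saved document and find-and-replace:

# A hypothetical, reusable template built from the checklist above.
PROMPT_TEMPLATE = """You are writing a {purpose} for {audience}.
Draw on the following sources: {sources}.
Tone and style: {tone}.
Formatting requirements: {formatting}.
Topic and key points: {topic}."""

def build_prompt(audience, purpose, sources, tone, formatting, topic):
    # Fill the template; every field is supplied by you, not the model.
    return PROMPT_TEMPLATE.format(
        audience=audience, purpose=purpose, sources=", ".join(sources),
        tone=tone, formatting=formatting, topic=topic,
    )

print(build_prompt(
    audience="politicians and their staffers",
    purpose="briefing report",
    sources=["<insert link>", "<insert link>"],
    tone="formal, persuasive, Australian-English spelling",
    formatting="executive summary in dot points, footnotes and bibliography",
    topic="investment needed to support soft plastic recycling",
))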
Validate & cross-check
Cross-checking protects your credibility and prevents the inclusion of fabricated or misleading material. Never assume AI output is correct or complete. Always verify citations, facts and quotations, and rely on your own expertise to ensure consistency. Once again, it is important to remember that AI text is best treated as a set of suggestions. I would expect that, in most situations, almost none of the AI text would make it into your final version.
Plan & Engage
At the risk of sounding like an 80s public service announcement, talk to your team, colleagues, direct reports, line manager, etc. about AI. The more you know. LLMs may not be of use to you or your organisation; however, a clear policy needs to be in place (and really should have been two years ago). In addition, ensure the AI policy is incorporated into your personal or business risk procedures and crisis management strategies. AI is a powerful tool, but it also exposes you to risks that you might not be familiar with.
AI Policy is Not Set & Forget
Develop a clear AI policy but treat it as a living document. Technologies evolve quickly, and so should your rules of engagement. Establish a regular review cycle to assess new risks, opportunities, and legal or regulatory changes. In addition, engagement with staff should not be a tick box on a form. Even if you are working by yourself or using AI for recipes, it is important to regularly revisit your policies and the tools you are employing.
Define, Own & Teach AI Use
Define permissible use cases. Spelling out where AI can and cannot be used in your work will allow you to more effectively manage the potential risks. For example: drafting early‑stage discussion papers, summarising meeting transcripts, or generating first‑pass policy options may be acceptable; automated decision‑making without human oversight would not be.
This needs to also clarify accountability and authorship. Make explicit who is responsible for verifying outputs, documenting AI‑assisted work, and ensuring compliance with relevant laws or organisational standards. This not only reduces reputational risk but also provides clarity in the event of disputes.
Providing training so that staff understand both the capabilities and limitations of AI tools builds capability and awareness. This includes prompt-engineering basics, bias awareness, and effective verification techniques. A workforce that knows why and how to use AI will make fewer costly mistakes.
Plan for Disaster
Your crisis management strategy and cyber security policy must include AI-specific scenarios. These include unauthorised disclosure of confidential material to a public model, publication of AI-generated misinformation, or reliance on faulty outputs in a policy recommendation. Pre-planning will help you act decisively when mistakes happen.