AI Accuracy

Last week’s post was about AI Productivity. I thought it was a simple example, but it turned out not to be the case. I asked Gemini how far it was from our 8th floor terrace on the Atlantic Ocean to the horizon. Gemini’s response was so clear. It explained the logic, the formula, and the calculations with a result of 2.5 miles. It looked so straightforward, I did not challenge it. I should have.

Two readers with experience at sea said 2.5 miles was not even close to the correct answer to my question. One of them said, based on experience, the answer should be more like 11 miles. The other sent me a page from the Nautical Almanac which put the answer at 10.5 nautical miles (12 statute miles).
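The readers' figures can be checked with the standard geometric horizon formula, d = sqrt(2Rh + h^2), where R is the Earth's radius and h is the observer's eye height. The sketch below is only a rough check under stated assumptions: the post does not give the height of the terrace, so the 80-foot eye height is a hypothetical value, and the Earth radius, the optional refraction coefficient, and the function names are illustrative choices rather than anything Gemini or the Nautical Almanac used.

```python
import math

# Assumed constants (not from the post): mean Earth radius and unit conversions.
EARTH_RADIUS_M = 6_371_000.0
M_PER_FOOT = 0.3048
M_PER_STATUTE_MILE = 1609.344
M_PER_NAUTICAL_MILE = 1852.0


def horizon_distance_m(eye_height_m, refraction_k=0.0):
    """Distance in meters from an observer to the sea horizon.

    Uses the geometric formula d = sqrt(2*R*h + h^2). A nonzero
    refraction_k enlarges the effective Earth radius to R / (1 - k),
    a common approximation for atmospheric bending of light.
    """
    effective_radius = EARTH_RADIUS_M / (1.0 - refraction_k)
    return math.sqrt(2.0 * effective_radius * eye_height_m + eye_height_m ** 2)


def eye_height_for_horizon_m(distance_m):
    """Inverse of the geometric formula: eye height that gives this horizon distance."""
    return math.sqrt(EARTH_RADIUS_M ** 2 + distance_m ** 2) - EARTH_RADIUS_M


# Hypothetical eye height for an 8th floor terrace; the post does not state it.
ASSUMED_TERRACE_HEIGHT_FT = 80.0

d = horizon_distance_m(ASSUMED_TERRACE_HEIGHT_FT * M_PER_FOOT)
print(f"Horizon: {d / M_PER_STATUTE_MILE:.1f} statute miles, "
      f"{d / M_PER_NAUTICAL_MILE:.1f} nautical miles")

# What eye height would produce a 2.5 mile horizon?
h = eye_height_for_horizon_m(2.5 * M_PER_STATUTE_MILE)
print(f"A 2.5 mile horizon corresponds to an eye height of {h / M_PER_FOOT:.1f} ft")
```

Under these assumptions the horizon comes out at roughly 11 statute miles, in line with what the two readers reported, while a 2.5-mile horizon corresponds to an eye height of only about 4 feet. Passing a small positive refraction_k (values around 0.13 are a commonly used assumption) pushes the distance slightly farther.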

I sent the responses to Gemini. It said it was interesting that the two experts I consulted provided a much larger distance (10-12 miles) for the horizon than the 2.5 miles it had calculated, and it offered a couple of possibilities. One was that there might be a misunderstanding in the information I received from the experts. It insisted the 2.5-mile distance is a well-established calculation based on the Earth’s curvature and my height above sea level, and that while atmospheric refraction can cause the horizon to appear slightly farther, it wouldn’t be a difference of several miles. The other was that “perhaps the experts were considering a different scenario”. For example, it said if I were at a much higher altitude, like on top of a mountain or in a plane, the horizon would indeed be farther away. When I referred to the Nautical Almanac, Gemini said that was an interesting detail. It said nautical almanacs are excellent resources for maritime navigation, but the information about the horizon distance might be due to a misunderstanding, because nautical almanacs typically focus on providing tables and data for celestial navigation, such as the positions of stars, planets, and the Sun at various times throughout the year.

The more I went back to Gemini, the more ridiculous and defensive it became. Clearly, arithmetic is not a strong suit for Gemini. I finally threw in the towel and told Gemini I had done so. It said, “There’s no excuse for these mistakes. I apologize for the confusion and frustration they caused”.

AI shows both strengths and weaknesses when you compare its abilities in text and arithmetic. In my experience, AI is great with text, not always but usually. On the other hand, I have found AI to be weak and inaccurate when it comes to numbers, arithmetic, and calculations. The distance to the horizon is the most extreme case I have experienced and the first time I have found AI to be defensive.

Starting with strengths in text and language nuance, AI can analyze vast amounts of text to understand complex grammar structures, semantics, and cultural references. This capability allows AI to translate languages with increasing accuracy, capturing the nuances and context of the original text. AI can condense lengthy documents into concise summaries, highlighting key points. Another strength of AI is sentiment analysis. It can identify the sentiment (positive, negative, neutral) expressed in text, such as my frustration when I threw in the towel on the horizon calculation. Sentiment analysis can also be useful for monitoring social media or customer reviews.

AI can be creative with text generation. It can generate different creative text formats, like poems, software code and scripts, musical pieces, even different writing styles. Recently, I showed my wife something I wrote with AI assistance. She said it was good but not my style. A strength of AI when it comes to style is you can tell it to rewrite something. For example, you can prompt it to make the text more formal, less formal, shorter, longer, bulleted, etc. I believe we will soon see an advance whereby you can feed the AI things you (or someone else) have written, and it will learn the style.

AI excels at searching through massive amounts of text to find relevant information based on keywords or concepts. This is crucial for tasks like web search. By using AI to understand your search query, a search engine can provide the most relevant results. If you don’t get exactly what you wanted, you can go into conversation mode. For example, you can say, “That is not what I wanted. I am looking for something like xyz.” I have found conversation is the best way to maximize the value of AI.

When it comes to weaknesses, there are some significant ones. Sometimes the AI just seems to lack common sense reasoning. It can struggle to understand the deeper meaning or implications of text that relies on common sense or real-world knowledge. Researchers around the world are working very hard to make improvements in this area.

Probably the most important weakness is bias. When AI models are trained on biased data, they can perpetuate those biases in their outputs. This is a major concern when dealing with sensitive topics like race, gender, politics, or social issues. A related area of concern is data poisoning, where bad actors intentionally feed large amounts of biased or factually inaccurate data onto the web, where it gets picked up by AI models.

When it comes to arithmetic strengths, Gemini touts exceptional speed and accuracy, far exceeding human capabilities. Ha ha. I would agree that as AI technology continues to evolve, we can expect it to overcome its current limitations in all areas. In my talks and writing, I will continue to point out the pluses and minuses of AI. There are plenty of each. In my book, Robot Attitude: How Robots and Artificial Intelligence Will Make Our Lives Better, I discussed many of the pluses in various industries, especially in healthcare.

Note: I use Gemini AI and other AI chatbots as my research assistants. AI can boost productivity for anyone who creates content. Sometimes I get incorrect data from AI, and when something looks suspicious, I dig deeper. Sometimes the data varies depending on the sources where AI finds it. I take responsibility for my posts, and if anyone spots an error, I will appreciate knowing about it and will correct it.