Abstract
This study demonstrates the application of artificial intelligence methods in veterinary ophthalmology through a combined approach of computer vision and language analysis. A customized U-Net model was used to detect and segment symptoms of canine eye diseases, including eye cloudiness, scleral redness, excessive tearing, and a colored ocular protrusion. The resulting outputs served as input to large language models (GPT-4o, Mistral 7B, Gemini2, Llama-3, and Claude 4) for interpreting the symptoms and providing a preliminary diagnosis. The evaluation was carried out using linguistic and semantic metrics (MPNet, MiniLM, BERTScore, CLIPScore, BLEU, METEOR, ROUGE, and SPICE). The results show that integrating U-Net segmentation with the analytical capabilities of large language models (LLMs) enables effective preliminary diagnosis of canine eye diseases. The model with a ResNet34 backbone achieved the highest accuracy in recognizing scleral redness, while GPT-4o was the most successful at interpreting symptoms and formulating a diagnosis. This approach contributes to the development of systems that can increase the accuracy and efficiency of veterinary diagnostics.

References
Anderson, P., Fernando, B., Johnson, M., & Gould, S. (2016). SPICE: Semantic Propositional Image Caption Evaluation. https://doi.org/10.48550/ARXIV.1607.08822
Boevé, M., & Stades, F. (1985). Glaucoma in dogs and cats. Review and retrospective evaluation of 421 patients. I. Pathobiological background, classification and breed predisposition. Tijdschrift Voor Diergeneeskunde, 110(6), 219–227.
Azad, R., Aghdam, E. K., Rauland, A., Jia, Y., Avval, A. H., Bozorgpour, A., Karimijafarbigloo, S., Cohen, J. P., Adeli, E., & Merhof, D. (2022). Medical Image Segmentation Review: The success of U-Net. https://doi.org/10.48550/ARXIV.2211.14830
Bucur, A.-M. (2023). Utilizing ChatGPT Generated Data to Retrieve Depression Symptoms from Social Media. https://doi.org/10.48550/ARXIV.2307.02313
Buric, M., Grozdanic, S., & Ivasic-Kos, M. (2024). Diagnosis of ophthalmologic diseases in canines based on images using neural networks for image segmentation. Heliyon, e38287. https://doi.org/10.1016/j.heliyon.2024.e38287
Burić, M., Ivašić-Kos, M., & Grozdanić, S. (n.d.). DogEyeSeg4: Dog Eye Segmentation 4-Class Ophthalmic Disease Dataset (No. urn:nbn:hr:195:405214) [Dataset]. Faculty of Informatics and Digital Technologies, University of Rijeka. Retrieved August 22, 2024, from https://urn.nsk.hr/urn:nbn:hr:195:405214
Burić, M., Paulin, G., & Ivašić-Kos, M. (n.d.). Object Detection Using Synthesized Data. In Proceedings of the ICT Innovations Conference, Ohrid, North Macedonia, 15.
Dandekar, A., Zen, R. A. M., & Bressan, S. (2018). A Comparative Study of Synthetic Dataset Generation Techniques. In S. Hartmann, H. Ma, A. Hameurlain, G. Pernul, & R. R. Wagner (Eds.), Database and Expert Systems Applications (pp. 387–395). Springer International Publishing. https://doi.org/10.1007/978-3-319-98812-2_35
Deane, J., Kearney, S., Kim, K. I., & Cosker, D. (2021). DynaDog+T: A Parametric Animal Model for Synthetic Canine Image Generation (No. arXiv:2107.07330). arXiv. http://arxiv.org/abs/2107.07330
Denkowski, M., & Lavie, A. (2014). Meteor Universal: Language Specific Translation Evaluation for Any Target Language. Proceedings of the Ninth Workshop on Statistical Machine Translation, 376–380. https://doi.org/10.3115/v1/W14-3348
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. https://doi.org/10.48550/ARXIV.1810.04805
Ganesan, K. (2018). ROUGE 2.0: Updated and Improved Measures for Evaluation of Summarization Tasks. https://doi.org/10.48550/ARXIV.1803.01937
Gemini Team, Reid, M., Savinov, N., Teplyashin, D., Lepikhin, D., Lillicrap, T., Alayrac, J., Soricut, R., Lazaridou, A., Firat, O., Schrittwieser, J., Antonoglou, I., Anil, R., Borgeaud, S., Dai, A., Millican, K., Dyer, E., Glaese, M., … Vinyals, O. (2024). Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. https://doi.org/10.48550/ARXIV.2403.05530
González-Chávez, O., Ruiz, G., Moctezuma, D., & Ramirez-delReal, T. A. (2023). Are metrics measuring what they should? An evaluation of image captioning task metrics (No. arXiv:2207.01733). arXiv. http://arxiv.org/abs/2207.01733
Grozdanić, S., Đukić, S., Luzhetskiy, S., Milčić-Matić, N., & Lazić, T. (2020). Atlas bolesti oka pasa i mačaka. Oculus Vet.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition (No. arXiv:1512.03385; Version 1). arXiv. http://arxiv.org/abs/1512.03385
He, Z., Bhasuran, B., Jin, Q., Tian, S., Hanna, K., Shavor, C., Arguello, L. G., Murray, P., & Lu, Z. (2024). Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study. Journal of Medical Internet Research, 26, e56655. https://doi.org/10.2196/56655
Hessel, J., Holtzman, A., Forbes, M., Le Bras, R., & Choi, Y. (2021). CLIPScore: A Reference-free Evaluation Metric for Image Captioning. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 7514–7528. https://doi.org/10.18653/v1/2021.emnlp-main.595
Hrga, I., & Ivasic-Kos, M. (2024). Measuring the Sensitivity of Image Captioning Metrics to Caption Perturbations. In X.-S. Yang, R. S. Sherratt, N. Dey, & A. Joshi (Eds.), Proceedings of Eighth International Congress on Information and Communication Technology (Vol. 696, pp. 1053–1063). Springer Nature Singapore. https://doi.org/10.1007/978-981-99-3236-8_85
Huang, K.-W., Yang, Y.-R., Huang, Z.-H., Liu, Y.-Y., & Lee, S.-H. (2023). Retinal Vascular Image Segmentation Using Improved UNet Based on Residual Module. Bioengineering, 10(6), 722. https://doi.org/10.3390/bioengineering10060722
Jiang, A. Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D. S., Casas, D. de las, Bressand, F., Lengyel, G., Lample, G., Saulnier, L., Lavaud, L. R., Lachaux, M.-A., Stock, P., Scao, T. L., Lavril, T., Wang, T., Lacroix, T., & Sayed, W. E. (2023). Mistral 7B. https://doi.org/10.48550/ARXIV.2310.06825
Johnsen, D. A. J., Maggs, D. J., & Kass, P. H. (2006). Evaluation of risk factors for development of secondary glaucoma in dogs: 156 cases (1999–2004). Journal of the American Veterinary Medical Association, 229(8), 1270–1274. https://doi.org/10.2460/javma.229.8.1270
Katic, T., Pavlovski, M., Sekulic, D., & Vucetic, S. (2021). Learning Semi-Structured Representations of Radiology Reports (No. arXiv:2112.10746). arXiv. http://arxiv.org/abs/2112.10746
Papineni, K., Roukos, S., Ward, T., & Zhu, W.-J. (2001). BLEU: A method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics - ACL ’02, 311. https://doi.org/10.3115/1073083.1073135
Anthropic PBC. (2024). Claude LLM (Version 3) [Computer software].
Petit, O., Thome, N., Rambour, C., & Soler, L. (2021). U-Net Transformer: Self and Cross Attention for Medical Image Segmentation. https://doi.org/10.48550/ARXIV.2103.06104
Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation (No. arXiv:1505.04597). arXiv. http://arxiv.org/abs/1505.04597
Sasazawa, Y., Yokote, K., Imaichi, O., & Sogawa, Y. (2023). Text Retrieval with Multi-Stage Re-Ranking Models. https://doi.org/10.48550/ARXIV.2311.07994
Savage, C. H., Park, H., Kwak, K., Smith, A. D., Rothenberg, S. A., Parekh, V. S., Doo, F. X., & Yi, P. H. (2024). General-Purpose Large Language Models Versus a Domain-Specific Natural Language Processing Tool for Label Extraction From Chest Radiograph Reports. American Journal of Roentgenology, AJR.23.30573. https://doi.org/10.2214/AJR.23.30573
Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition (No. arXiv:1409.1556). arXiv. http://arxiv.org/abs/1409.1556
Song, K., Tan, X., Qin, T., Lu, J., & Liu, T.-Y. (2020). MPNet: Masked and Permuted Pre-training for Language Understanding. https://doi.org/10.48550/ARXIV.2004.09297
Sreng, S., Maneerat, N., Hamamoto, K., & Win, K. Y. (2020). Deep Learning for Optic Disc Segmentation and Glaucoma Diagnosis on Retinal Images. Applied Sciences, 10(14), 4916. https://doi.org/10.3390/app10144916
Strom, A. R., Hässig, M., Iburg, T. M., & Spiess, B. M. (2011). Epidemiology of canine glaucoma presented to University of Zurich from 1995 to 2009. Part 1: Congenital and primary glaucoma (4 and 123 cases). Veterinary Ophthalmology, 14(2), 121–126. https://doi.org/10.1111/j.1463-5224.2010.00855.x
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2015). Rethinking the Inception Architecture for Computer Vision (No. arXiv:1512.00567; Version 3). arXiv. http://arxiv.org/abs/1512.00567
Tan, M., & Le, Q. V. (2020). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (No. arXiv:1905.11946). arXiv. http://arxiv.org/abs/1905.11946
Thamizharasan, A., Murugan, M. S., & Parthiban, S. (2016). Surgical Management of Cherry Eye in a Dog. Intas Polivet, 17(II), 420–421.
Thirunavukarasu, A. J., Mahmood, S., Malem, A., Foster, W. P., Sanghera, R., Hassan, R., Zhou, S., Wong, S. W., Wong, Y. L., Chong, Y. J., Shakeel, A., Chang, Y.-H., Tan, B. K. J., Jain, N., Tan, T. F., Rauz, S., Ting, D. S. W., & Ting, D. S. J. (2024). Large language models approach expert-level clinical knowledge and reasoning in ophthalmology: A head-to-head cross-sectional study. PLOS Digital Health, 3(4), e0000341. https://doi.org/10.1371/journal.pdig.0000341
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., Bikel, D., Blecher, L., Ferrer, C. C., Chen, M., Cucurull, G., Esiobu, D., Fernandes, J., Fu, J., Fu, W., … Scialom, T. (2023). Llama 2: Open Foundation and Fine-Tuned Chat Models (No. arXiv:2307.09288). arXiv. http://arxiv.org/abs/2307.09288
Tripathi, R. M., Kashyap, D. K., & Giri, D. K. (2014). Surgical Management of Cherry Eye in a Dog. Intas Polivet, 15(1), 131–132.
Wang, W., Wei, F., Dong, L., Bao, H., Yang, N., & Zhou, M. (2020). MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers. https://doi.org/10.48550/ARXIV.2002.10957
Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q., & Artzi, Y. (2019). BERTScore: Evaluating Text Generation with BERT. https://doi.org/10.48550/ARXIV.1904.09675

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright (c) 2025
