Book Review Sentiment Analysis

A model comparing Baseline and LSTM approaches for analyzing Amazon Kindle book reviews.
We filtered reviews down to 21,605 sentences, classified them as positive/negative/neutral, and visualized the results using LIME graphs. เป็นโมเดลที่ทำการเปรียบเทียบระหว่างการใช้ baseline และ LSTM มาวิเคราะห์รีวิวหนังสือจาก Amazon Kindle
โดยเรานำรีวิวทั้งหมดมากรองคำออกมาได้ 21,605 ประโยคแล้วแยกรีวิวเป็นเชิงบวก/ลบ และเป็นกลางโดยจะสรุปผลออกมาในรูปแบบกราฟ Lime

Information

GitHub

Preprocessing

Started with data cleaning, removing reviews that were too short, contained null values, or had incorrect sentiment labels, resulting in 21,684 usable rows. Created input in "aspect sentence" format to help the model identify which aspect to analyze. เริ่มที่การทำความสะอาดข้อมูล เช่นตัดรีวิวที่สั้นเกินไป มีค่าว่าง หรือ sentiment ที่ไม่ถูกต้องออกทำให้เหลือข้อมูลพร้อมใช้งานเพียง 21,684 แถว และสร้าง Input ในรูปแบบ aspect sentence เพื่อช่วยให้โมเดลรู้ว่าต้องวิเคราะห์เป็น Aspect(ด้าน)ใดๆ

LSTM Model

Our main model is Bidirectional LSTM, chosen for its ability to understand sentence context. We added Class Weights for Neutral and Negative classes to address data imbalance, achieving an accuracy of 67%. โดยโมเดลหลักของเราคือ Bidirectional LSTM เพื่อให้เข้าใจบริบทของประโยค และเพิ่ม Class Weights ให้ Neutral และ Negative เพื่อแก้ปัญหาข้อมูลไม่สมดุล โดยได้ค่า Accuracy ที่67%

Baseline Model

Our baseline is TF-IDF + Logistic Regression, a simple yet highly effective model yielding the highest accuracy in the project at 68%. This demonstrates that while Logistic Regression is slightly more accurate, Bi-LSTM offers better contextual understanding, especially in sentences with mixed sentiments. Baseline ของเราคือ TF-IDF + Logistic Regression เป็นโมเดลง่ายแต่มีประสิทธิภาพดีมาก และให้ Accuracy สูงที่สุดในโปรเจกต์ที่ 68% ซึ่งจะแสดงให้เห็นว่าแม้ Logistic Regression จะได้ความแม่นยำที่สูงกว่าเล็กน้อย แต่ Bi-LSTM จะมีความสามารถเชิงเข้าใจบริบทที่ละเอียดกว่า โดยเฉพาะในประโยคที่มี Sentiment ผสมกัน

Case 1-3

The first 3 cases involve sentences with consistent sentiment. In non-conflicting cases, both models performed 100% correctly without issues, showing they capture keywords like "amazing" or "terrible" well. However, LSTM provided significantly higher confidence scores. โดย 3 เคสแรกเป็นประโยคที่มี Sentiment ตรงกัน โดยในเคสที่มีผลไม่ขัดแย้งกันได้ผลเป็นทั้งสองโมเดลทำถูก 100% ไม่มีปัญหา แสดงว่าโมเดลจับคำสำคัญ เช่น amazing หรือ terrible ได้ดี แต่ LSTM ให้คะแนนความมั่นใจสูงกว่าอย่างเห็นได้ชัด

Case 4

In conflicting sentence cases, the Baseline model failed completely because the presence of both positive and negative words averaged out to Neutral. LSTM, while not perfect, correctly paired "amazing" with "plot", demonstrating better contextual understanding than the baseline. ในเคสที่ขัดแย้งกัน sentence โมเดล Baseline จะผิดทั้งหมด เพราะเวลามีทั้งคำบวกและลบ จะเฉลี่ยออกมากลายเป็น Neutral ส่วน LSTM แม้ไม่ได้ถูกทั้งหมด แต่สามารถจับคู่คำว่า amazing กับ plot ได้ถูกต้อง ถือว่าความเข้าใจเชิงบริบทดีกว่า baseline

Conclusion

We found that words like "amazing" and "boring" clearly impact the results, though LSTM still has some bias from positive words at the beginning of sentences. In conclusion, although Logistic Regression had 1% higher accuracy, Bi-LSTM is more suitable for ABSA due to its understanding of conjunctions, long-distance dependencies, and complex sentences. จะพบว่าคำว่า amazing และ boring มีผลต่อผลลัพธ์ชัดเจน แต่ LSTM ยังติด bias จากคำ positive ที่อยู่ต้นประโยค ทำให้บางครั้งทายผิดครับ สรุปคือ แม้ Logistic Regression จะให้ Accuracy สูงกว่า 1% แต่ Bi-LSTM เหมาะสมกว่าในการทำ ABSA เพราะเข้าใจคำเชื่อม ความสัมพันธ์ระยะไกล และแยกแยะประโยคที่ซับซ้อนมากกว่า