Not to mention you're listening through a phone. The vocals are upfront while the beat is in the back, which probably isn't how it would sound in reality. So I can understand why it sounds busy. It's almost like too many layers and too much harmonizing, but if you take into account how you're listening....
Point is, a snippet only shares but so much.