The difference between \(L_1\) and \(L_2\) regularization is well studied in ML (and you can find plenty of references online). However, the most common explanation is perhaps via this figure.
While this figure is correct (and seemingly intuitive at first), it may not be so easy to digest. It corresponds to an equivalent constrained optimization problem, but why we need to recast the regularized problem as a constrained one is not obvious. In fact, someone asked this question on Quora: What's a good way to provide intuition as to why the lasso (L1 regularization) results in sparse weight vectors? That question motivated me to provide a simpler explanation.
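Before getting to the explanation, the sparsity effect itself is easy to see numerically. Below is a minimal sketch (not from the original post) using the closed-form solution of the one-dimensional regularized problem: with an \(L_1\) penalty, minimizing \(\frac{1}{2}(w - a)^2 + \lambda |w|\) gives soft-thresholding, which sets small coefficients exactly to zero, while the \(L_2\) version \(\frac{1}{2}(w - a)^2 + \frac{\lambda}{2} w^2\) only shrinks them. The function names `prox_l1` and `prox_l2` are my own labels.

```python
import numpy as np

def prox_l1(a, lam):
    # Soft-thresholding: minimizer of 0.5*(w - a)^2 + lam*|w|.
    # Any |a| <= lam is mapped exactly to zero.
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def prox_l2(a, lam):
    # Shrinkage: minimizer of 0.5*(w - a)^2 + 0.5*lam*w^2.
    # Coefficients shrink toward zero but never reach it.
    return a / (1.0 + lam)

a = np.array([3.0, 0.5, -0.2, 1.5])  # unregularized least-squares solution
lam = 1.0

print(prox_l1(a, lam))  # small entries become exactly 0
print(prox_l2(a, lam))  # all entries stay nonzero
```

Running this, the \(L_1\) result zeroes out the two coefficients smaller than \(\lambda\) in magnitude, while the \(L_2\) result keeps every coefficient nonzero, which is exactly the sparse-versus-dense contrast the figure is meant to convey.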