The scalar-on-image regression model relates a scalar response variable to a two-dimensional image predictor through a bivariate coefficient function $\beta(t,s)$. Traditional scalar-on-image regression models often assume that the coefficient function varies smoothly across the two-dimensional domain. While this smoothness assumption enhances stability, it can limit interpretability, particularly in applications where sparsity (i.e., only specific image regions influence the response variable) is a critical feature. Although many applications require coefficient estimates that are both sparse and smooth, few methods address both requirements simultaneously. In this paper, we propose a robust Generalized Dantzig Selector (GDS) method to estimate the coefficient function in scalar-on-image regression models. Our approach yields interpretable coefficient estimates: it enforces smoothness to reduce noise and stabilize estimation, while accurately identifying the zero regions of $\beta(t,s)$, that is, the image areas that do not influence the response variable. The proposed GDS method outperforms existing techniques in both simulations and real data analyses. Furthermore, we provide theoretical support for the proposed method, including non-asymptotic bounds on the estimation error.
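
As a point of reference, the model class described above is commonly written as follows. The notation is an assumption made for illustration (responses $Y_i$, images $X_i(t,s)$ on a domain $\mathcal{T}\times\mathcal{S}$, intercept $\alpha$, errors $\varepsilon_i$) and is not taken from the paper:

\[
Y_i = \alpha + \int_{\mathcal{S}}\int_{\mathcal{T}} X_i(t,s)\,\beta(t,s)\,dt\,ds + \varepsilon_i, \qquad i = 1,\dots,n.
\]

If $\beta(t,s)$ is expanded in a finite basis (again an assumption, with hypothetical coefficient vector $b$ and design matrix $Z$), the classical Dantzig selector for the resulting linear model solves

\[
\hat{b} = \arg\min_{b}\ \|b\|_1 \quad \text{subject to} \quad \|Z^\top (Y - Zb)\|_\infty \le \lambda,
\]

for a tuning parameter $\lambda \ge 0$. The proposed GDS presumably adapts this template so that the estimate is both sparse and smooth; its exact formulation is not stated here.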