My usual method for this kind of photography:
I shoot while walking, with the phone by my side in one hand and a Bluetooth trigger (CamKix) in the other. Sometimes when I go out without the trigger, I use the volume buttons to snap the shot, but 1) that usually makes the phone move even more and often spoils the shot, and 2) if I hold the phone so that I can get a finger or thumb on the volume buttons, I often end up with my blurry fingers in the image.
Since hip shooting is blind shooting, I always use the wide-angle lens to give myself the best chance of getting the subject in the frame. Sometimes I add the Moment wide-angle to the stock lens, which gives me an even wider capture area, about the same as the ultra-wide on the new iPhones, I think. An added benefit of the Moment lens is that I can hold the phone with my fingers behind the lens to keep them out of the picture.
Also, my captures almost always come out tilted, no matter how much I try to be aware of my hand position, so the wide-angle lens gives me extra room for rotating the image back to straight without chopping off anything important. (Yes, yes, I confess my personal preference is to have horizons, streets, buildings, etc., be level and plumb, unless the off-axis presentation is there to make some aesthetic point.)
I almost always use the iPhone's native camera app and shoot JPGs for these kinds of photos. I used to use burst mode until Apple changed the trigger for it, so now I snap off repeated shots with the Bluetooth trigger. The 3-second Live Photos would be another option, but they usually don't work out as well for me.